Jie-Ying Lee · Yi-Ruei Liu · Shr-Ruei Tsai · Wei-Cheng Chang · Chung-Ho Wu · Jiewen Chan · Zhenjun Zhao · Chieh Hubert Lin · Yu-Lun Liu
Synthesizing large-scale, explorable, and geometrically accurate 3D urban scenes is a challenging yet valuable task for immersive and embodied applications. The challenge lies in the lack of large-scale, high-quality real-world 3D scans for training generalizable generative models. In this paper, we take an alternative route: we create large-scale 3D scenes by synergizing readily available satellite imagery, which supplies realistic coarse geometry, with open-domain diffusion models, which create high-quality close-up appearances. We propose Skyfall-GS, the first large-scale 3D-scene creation framework that requires no costly 3D annotations and supports real-time, immersive 3D exploration. We tailor a curriculum-driven iterative refinement strategy to progressively enhance geometric completeness and photorealistic textures. Extensive experiments demonstrate that Skyfall-GS provides improved cross-view consistent geometry and more realistic textures compared to state-of-the-art approaches.
- Installation
- Dataset
- Training
- Automated Training Scripts
- Evaluation
- Rendering and Visualization
- Online Viewer
- Useful Scripts
- Acknowledgement
- Citation
- License
- Clone the repository:

  git clone --recurse-submodules https://github.com/jayin92/Skyfall-GS.git
  cd Skyfall-GS

- Create and activate a Conda environment:

  conda create -y -n skyfall-gs python=3.10
  conda activate skyfall-gs

- Install dependencies:

  conda install cuda-toolkit=12.8 cuda-nvcc=12.8 -c nvidia
  pip install -r requirements.txt
  pip install --force-reinstall torch torchvision torchaudio
  pip install submodules/diff-gaussian-rasterization-depth
  pip install submodules/simple-knn
  pip install submodules/fused-ssim
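As a quick sanity check after installation (not part of the official setup steps), you can confirm that the reinstalled PyTorch build actually sees the GPU:

```python
# Quick environment check (illustrative; not part of the official setup).
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```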
The datasets required to train the Skyfall-GS model should be placed in the data/ directory. The JAX and NYC datasets are available for download from Google Drive.

- Download the zip files.

- Unzip the datasets into the data/ directory:

  unzip datasets_JAX.zip
  unzip datasets_NYC.zip

After unzipping, the directory structure inside the data/ directory should look like this:
data/
├── datasets_JAX/
│ ├── JAX_004
│ ├── JAX_068
│ └── ...
└── datasets_NYC/
├── NYC_004
├── NYC_010
└── ...
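If train.py later complains about a missing scene, a small check along these lines (illustrative only) confirms that the datasets were unpacked where they are expected:

```python
# Verify that the unpacked scene folders exist under data/ (illustrative check).
from pathlib import Path

data_root = Path("data")
for dataset in ("datasets_JAX", "datasets_NYC"):
    scenes = sorted(p.name for p in (data_root / dataset).glob("*") if p.is_dir())
    print(f"{dataset}: {len(scenes)} scene folders found", scenes[:5])
```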
The training process is divided into two main stages.
Stage 1 reconstructs the initial 3D scene from satellite imagery:
python train.py \
-s ./data/datasets_JAX/JAX_068/ \
-m ./outputs/JAX/JAX_068 \
--eval \
--port 6209 \
--kernel_size 0.1 \
--resolution 1 \
--sh_degree 1 \
--appearance_enabled \
--lambda_depth 0 \
--lambda_opacity 10 \
--densify_until_iter 21000 \
--densify_grad_threshold 0.0001 \
--lambda_pseudo_depth 0.5 \
--start_sample_pseudo 1000 \
--end_sample_pseudo 21000 \
--size_threshold 20 \
--scaling_lr 0.001 \
--rotation_lr 0.001 \
--opacity_reset_interval 3000 \
--sample_pseudo_interval 10
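If you prefer not to use the helper scripts described below, Stage 1 can be launched for several scenes in a short loop. This is an illustrative sketch only; the scene IDs are just the ones visible in the dataset listing, and only a subset of the flags above is repeated here:

```python
# Launch Stage 1 training for several JAX scenes sequentially (illustrative sketch).
import subprocess

SCENES = ["JAX_004", "JAX_068"]  # example scene IDs from the dataset listing above

for scene in SCENES:
    subprocess.run(
        [
            "python", "train.py",
            "-s", f"./data/datasets_JAX/{scene}/",
            "-m", f"./outputs/JAX/{scene}",
            "--eval",
            "--kernel_size", "0.1",
            "--sh_degree", "1",
            "--appearance_enabled",
            # ... pass the remaining Stage 1 flags exactly as in the command above ...
        ],
        check=True,
    )
```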
Stage 2 refines the geometry and synthesizes high-quality textures using an iterative dataset update (IDU) strategy, starting from the pretrained Stage 1 checkpoint:
python train.py \
-s ./data/datasets_JAX/JAX_068/ \
-m ./outputs/JAX_idu/JAX_068 \
--start_checkpoint ./outputs/JAX/JAX_068/chkpnt30000.pth \
--iterative_datasets_update \
--eval \
--port 6209 \
--kernel_size 0.1 \
--resolution 1 \
--sh_degree 1 \
--appearance_enabled \
--lambda_depth 0 \
--lambda_opacity 0 \
--idu_opacity_reset_interval 5000 \
--idu_refine \
--idu_num_samples_per_view 2 \
--densify_grad_threshold 0.0002 \
--idu_num_cams 6 \
--idu_use_flow_edit \
--idu_render_size 1024 \
--idu_flow_edit_n_min 4 \
--idu_flow_edit_n_max 10 \
--idu_grid_size 3 \
--idu_grid_width 512 \
--idu_grid_height 512 \
--idu_episode_iterations 10000 \
--idu_iter_full_train 0 \
--idu_opacity_cooling_iterations 500 \
--lambda_pseudo_depth 0.5 \
--idu_densify_until_iter 9000 \
--idu_train_ratio 0.75
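Note that Stage 2 resumes from the Stage 1 checkpoint (chkpnt30000.pth under the Stage 1 output directory). A small guard like the following (illustrative only) avoids starting an IDU run before that checkpoint exists:

```python
# Only start Stage 2 (IDU) once the Stage 1 checkpoint is available (illustrative).
from pathlib import Path

scene = "JAX_068"
stage1_ckpt = Path(f"./outputs/JAX/{scene}/chkpnt30000.pth")

if stage1_ckpt.is_file():
    print(f"Found Stage 1 checkpoint for {scene}: {stage1_ckpt}")
    # pass str(stage1_ckpt) as --start_checkpoint in the Stage 2 command above
else:
    print(f"Skipping {scene}: run Stage 1 first to produce {stage1_ckpt}")
```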
The scripts/ directory contains scripts for automated training on different datasets and configurations:

- scripts/run_jax.py: Runs Stage 1 training for the JAX dataset scenes.
- scripts/run_jax_idu.py: Runs Stage 2 (IDU) training for the JAX dataset scenes.
- scripts/run_jax_naive.py: Runs a naive training for the JAX dataset scenes without advanced features.
- scripts/run_nyc.py: Runs Stage 1 training for the NYC dataset scenes.
- scripts/run_nyc_idu.py: Runs Stage 2 (IDU) training for the NYC dataset scenes.
- scripts/run_nyc_naive.py: Runs a naive training for the NYC dataset scenes.
The eval.py script evaluates a trained model by computing various metrics that compare rendered images with ground-truth images.
The evaluation data, which includes the ground-truth videos and the rendered videos from other methods, can be downloaded from Google Drive. After downloading, unzip the file and place the results_eval directory in the root of the project.
python eval.py \
--data_dir results_eval/data_eval_JAX \
--temp_dir temp_frames_JAX \
--methods mip-splatting sat-nerf eogs corgs ours_stage1 ours_stage2 \
--output_file metrics_results_JAX.csv \
--frame_rate 30 \
--resolution 1024 \
--batch_size 64
python eval.py \
--data_dir results_eval/data_eval_NYC \
--temp_dir temp_frames_NYC \
--methods citydreamer gaussiancity corgs ours_stage1 ours_stage2 \
--output_file metrics_results_NYC.csv \
--frame_rate 24 \
--no_resize \
--batch_size 64
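Both commands write per-method metrics to the CSV file given by --output_file. A minimal sketch for inspecting the results afterwards, assuming only that the output is a standard CSV (the exact column names are defined by eval.py):

```python
# Print the metrics CSV produced by eval.py (illustrative; column names may differ).
import csv

with open("metrics_results_JAX.csv", newline="") as f:
    for row in csv.reader(f):
        print(", ".join(row))
```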
The script calculates the following metrics:
- PSNR: Peak Signal-to-Noise Ratio
- SSIM: Structural Similarity Index
- LPIPS: Learned Perceptual Image Patch Similarity
- CLIP-FID: FID score calculated using CLIP features
- CMMD: CLIP Maximum Mean Discrepancy
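For reference, CMMD is a (squared) maximum mean discrepancy computed between CLIP embeddings of real and rendered images, typically with a Gaussian RBF kernel. The sketch below shows only how such an MMD² estimate is formed from precomputed embeddings; the bandwidth, embedding dimension, and estimator choice are assumptions, and this is not the implementation used by eval.py:

```python
# Squared MMD between two sets of (precomputed) CLIP embeddings with an RBF kernel.
# Illustrative of how CMMD is defined, not the exact eval.py implementation.
import numpy as np

def rbf_kernel(a: np.ndarray, b: np.ndarray, sigma: float) -> np.ndarray:
    # Pairwise squared Euclidean distances between rows of a and b.
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-sq_dists / (2.0 * sigma**2))

def mmd_squared(x: np.ndarray, y: np.ndarray, sigma: float = 10.0) -> float:
    # Biased estimator: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)].
    return float(
        rbf_kernel(x, x, sigma).mean()
        + rbf_kernel(y, y, sigma).mean()
        - 2.0 * rbf_kernel(x, y, sigma).mean()
    )

# Example with random stand-ins for CLIP embeddings (dimension 512 assumed).
rng = np.random.default_rng(0)
real, fake = rng.normal(size=(64, 512)), rng.normal(size=(64, 512))
print("MMD^2:", mmd_squared(real, fake))
```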
The render_video.py script can be used to render a video from a trained model along a specified camera path:
python render_video.py \
-m <path_to_model_directory> \
--camera_path <path_to_camera.json> \
--load_from_checkpoints \
--iteration <checkpoint_iteration> \
--save_images \
--depth
- -m: Path to the model directory.
- --camera_path: Path to the camera trajectory JSON file.
- --load_from_checkpoints: Load the model from a checkpoint.
- --iteration: The checkpoint iteration to use.
- --save_images: Save individual frames of the video.
- --depth: Render depth maps instead of RGB images.
You can also render a video from a .ply file using render_video_from_ply.py:
python render_video_from_ply.py \
--ply_path <path_to_ply_file> \
--camera_path <path_to_camera.json>
After training, you can create a fused PLY file for visualization in an online viewer.
- Generate the fused PLY file:
python create_fused_ply.py -m ./output_stage2/JAX_068 --output_ply fused/JAX_068_fused.ply --iteration 80000 --load_from_checkpoints
- Visualize in the online viewer:

  Use the online viewer to visualize the trained model. For optimal viewing, use the following settings:

  - Up vector: 0,0,1
  - SH degree: 1
  - Camera origin: 0,0,200
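If the viewer does not display the scene as expected, one quick sanity check is to list the elements and per-vertex properties of the fused PLY (a sketch that assumes the third-party plyfile package, which is not necessarily listed in requirements.txt):

```python
# Inspect the fused PLY: list its elements and per-vertex properties (illustrative).
# Requires the third-party `plyfile` package (pip install plyfile).
from plyfile import PlyData

ply = PlyData.read("fused/JAX_068_fused.ply")
for element in ply.elements:
    print(f"{element.name}: {element.count} items")
    print("  properties:", [prop.name for prop in element.properties])
```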
This project includes several other useful scripts:
- align_ges.py: Find the optimal target altitude by comparing with ground truth.
- convert.py: A COLMAP converter script.
- dsmr.py: Functions for DSM registration.
- evaluate_gs_geometry.py: Evaluate geometry accuracy for a single scene.
- gen_render_path.py: Generate a camera path for an orbit view around a target point.
- render_videos.py: Batch-render videos from multiple models and camera paths.
- sat_utils.py: Utility functions for handling satellite images and georeferenced data.
- scripts/merge_images.py: Merge two frames into one.
This codebase is built upon several open-source projects; we thank the authors for their contributions.
This research was funded by the National Science and Technology Council, Taiwan, under Grants NSTC 112-2222-E-A49-004-MY2 and 113-2628-EA49-023-. The authors are grateful to Google, NVIDIA, and MediaTek Inc. for their generous donations. Yu-Lun Liu acknowledges the Yushan Young Fellow Program by the MOE in Taiwan.
If you find this work useful, please consider citing:
@article{lee2025SkyfallGS,
title = {{Skyfall-GS}: Synthesizing Immersive {3D} Urban Scenes from Satellite Imagery},
author = {Jie-Ying Lee and Yi-Ruei Liu and Shr-Ruei Tsai and Wei-Cheng Chang and Chung-Ho Wu and Jiewen Chan and Zhenjun Zhao and Chieh Hubert Lin and Yu-Lun Liu},
journal = {arXiv preprint},
year = {2025},
eprint = {2510.15869},
archivePrefix = {arXiv}
}
This project is licensed under the terms of the Apache License 2.0.