Enhanced 3D Urban Scene Reconstruction and Point Cloud Densification using Gaussian Splatting and Google Earth Imagery

Kyle Gao, Dening Lu, Hongjie He, Linlin Xu, Jonathan Li

TL;DR

The paper tackles large-scale urban 3D reconstruction and photorealistic view synthesis from Google Earth imagery. It applies 3D Gaussian Splatting (3DGS), which pairs a differentiable rasterizer with per-Gaussian view-dependent color encoded as spherical harmonics, to achieve fast, photorealistic novel-view synthesis and point cloud densification in remote sensing scenarios. Results on the Waterloo region and BungeeNeRF-style city scenes show that 3DGS can surpass NeRF-based methods in visual quality while training substantially faster, albeit with higher memory demands and some geometric misalignment relative to the COLMAP MVS reference reconstruction. The work highlights the potential of 3DGS driven by satellite and aerial imagery for urban digital twins and GIS applications, and identifies limitations and future directions in memory efficiency, local multi-scale modeling, and semantic 3D reconstruction.
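
To make the spherical-harmonics term in the TL;DR concrete, the sketch below evaluates a view-dependent RGB color for one Gaussian from its band-0 and band-1 coefficients. It is a minimal illustration, not the authors' code: the function name `sh_color`, the coefficient layout, the 0.5 offset, and the clamping to [0, 1] are assumptions borrowed from common 3DGS practice. Only the two basis constants are standard.

```python
import math

# Real spherical-harmonic basis constants for bands 0 and 1.
SH_C0 = 0.28209479177387814  # 1 / (2 * sqrt(pi))
SH_C1 = 0.4886025119029199   # sqrt(3) / (2 * sqrt(pi))


def sh_color(coeffs, view_dir):
    """Evaluate a view-dependent RGB color (hypothetical helper).

    coeffs:   four RGB triples, one for band 0 and three for band 1
              (assumed layout).
    view_dir: 3-vector from the camera to the Gaussian; normalized here.
    """
    norm = math.sqrt(sum(c * c for c in view_dir))
    x, y, z = (c / norm for c in view_dir)
    # Basis values in the order matching the assumed coefficient layout.
    basis = (SH_C0, -SH_C1 * y, SH_C1 * z, -SH_C1 * x)
    rgb = []
    for channel in range(3):
        value = sum(b * coeffs[k][channel] for k, b in enumerate(basis))
        # Assumed convention: shift by 0.5 and clamp to the displayable range.
        rgb.append(min(1.0, max(0.0, value + 0.5)))
    return tuple(rgb)


# The same Gaussian seen from above and from below gets different colors
# because its band-1 z coefficient is non-zero.
coeffs = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (0.0, 0.0, 0.0)]
from_above = sh_color(coeffs, (0.0, 0.0, 1.0))
from_below = sh_color(coeffs, (0.0, 0.0, -1.0))
```

Higher bands add more coefficients per Gaussian, which is one plausible contributor to the memory cost noted in the TL;DR.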

Abstract

3D urban scene reconstruction and modelling is a crucial research area in remote sensing with numerous applications in academia, commerce, industry, and administration. Recent advancements in view synthesis models have facilitated photorealistic 3D reconstruction solely from 2D images. Leveraging Google Earth imagery, we construct a 3D Gaussian Splatting model of the Waterloo region centered on the University of Waterloo and achieve view-synthesis results far exceeding previous 3D view-synthesis results based on neural radiance fields, which we demonstrate in our benchmark. Additionally, we retrieved the 3D geometry of the scene using the 3D point cloud extracted from the 3D Gaussian Splatting model, which we benchmarked against our Multi-View-Stereo dense reconstruction of the scene, thereby reconstructing both the 3D geometry and photorealistic lighting of the large-scale urban scene through 3D Gaussian Splatting.
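
The geometry step in the abstract (extracting a point cloud from the Gaussians and comparing it with a Multi-View-Stereo reconstruction) can be sketched as follows. This is an illustrative example under stated assumptions: the distance metric used in the paper is not given in this section, so a one-directional mean nearest-neighbor distance stands in for it; both clouds are assumed to be in the same coordinate frame already; and the helper names and sample coordinates are hypothetical.

```python
import math


def mean_nn_distance(source, reference):
    """Mean distance from each source point to its nearest reference point.

    Brute force, O(len(source) * len(reference)); a spatial index would be
    needed for clouds of realistic size.
    """
    total = 0.0
    for p in source:
        total += min(math.dist(p, q) for q in reference)
    return total / len(source)


def to_ascii_ply(points):
    """Serialize 3D points as an ASCII PLY string with x, y, z only."""
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "end_header",
    ]
    body = [f"{x} {y} {z}" for x, y, z in points]
    return "\n".join(header + body) + "\n"


# Hypothetical data: Gaussian means as the extracted cloud, MVS points as reference.
gaussian_means = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
mvs_points = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
error = mean_nn_distance(gaussian_means, mvs_points)
ply_text = to_ascii_ply(gaussian_means)
```

A one-directional distance only measures how far the extracted points lie from the reference. Swapping the arguments measures how well the reference is covered.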

Paper Structure (21 sections, 10 equations, 7 figures, 4 tables)

This paper contains 21 sections, 10 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: 3D Gaussian Splatting workflow, image from 2023gaussian_splatting. SfM is used to create a sparse point cloud to initialize the 3D Gaussian Splatting model. From these 3D Gaussians, new images are generated via the rasterizer and compared to ground truth images during optimization. Gaussians are densified as required.
  • Figure 2: Camera poses and reconstruction for the region of study: the Waterloo scene, centered on the EV1 building at the University of Waterloo. Left: Google Earth Studio camera path. Middle: sparse point cloud reconstructed by COLMAP SfM (2016COLMAP), with projected color and camera poses (as red dots). Right: aerial image of the region of study, sourced from Airbus (terms of use: https://www.google.com/help/terms_maps/).
  • Figure 3: Adaptive Gaussian Densification, adapted from 2023gaussian_splatting. Top: Cloning Gaussians to cover regions with under-reconstructed details. Bottom: Splitting Gaussians that cover a large area but insufficiently represent local geometry.
  • Figure 4: Ground truth, generated images, and visualization of Gaussian means of our Waterloo scene at different altitudes and orientations. Left: ground truth. Middle: 3DGS generated image. Right: visualization of the location of each Gaussian, i.e., its 3D positional mean. These points were then extracted as point clouds.
  • Figure 5: Ground truth vs rendered images of the New York and San Francisco scenes at different altitudes and orientations. Left to right: New York ground truth; New York render; San Francisco ground truth; San Francisco render.
  • ...and 2 more figures
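
The clone-versus-split behavior described in the Figure 3 caption can be sketched as a single densification pass. This is a simplified illustration, not the paper's implementation: the `Gaussian` class, the isotropic `scale`, the use of an accumulated positional gradient as the trigger, and the threshold and shrink values are all assumptions chosen for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Gaussian:
    mean: tuple    # 3D position
    scale: float   # isotropic extent (simplification of a full 3D covariance)
    grad: float    # accumulated positional gradient magnitude (assumed signal)


def densify(gaussians, grad_threshold=0.0002, scale_threshold=0.01,
            shrink=1.6, rng=None):
    """One densification pass; all default values are illustrative assumptions."""
    rng = rng or random.Random(0)
    out = []
    for g in gaussians:
        if g.grad < grad_threshold:
            # Region is reconstructed well enough: keep as is.
            out.append(g)
        elif g.scale <= scale_threshold:
            # Small Gaussian in an under-reconstructed region: clone it.
            out.append(g)
            out.append(Gaussian(g.mean, g.scale, 0.0))
        else:
            # Large Gaussian covering too much area: replace it with two
            # smaller ones sampled around the original mean.
            for _ in range(2):
                mean = tuple(rng.gauss(m, g.scale) for m in g.mean)
                out.append(Gaussian(mean, g.scale / shrink, 0.0))
    return out


scene = [
    Gaussian((0.0, 0.0, 0.0), 0.005, 0.0),    # kept unchanged
    Gaussian((1.0, 0.0, 0.0), 0.005, 0.001),  # cloned
    Gaussian((2.0, 0.0, 0.0), 1.0, 0.001),    # split
]
densified = densify(scene)
```

Both branches add Gaussians, so repeated passes grow the model, which is consistent with the memory demands mentioned in the TL;DR.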