Two-Stage Gaussian Splatting Optimization for Outdoor Scene Reconstruction
Deborah Pintani, Ariel Caputo, Noah Lewis, Marc Stamminger, Fabio Pellacini, Andrea Giachetti
TL;DR
This work tackles outdoor scene reconstruction with large background regions by introducing a two-stage Gaussian Splatting (GS) pipeline that explicitly separates background and foreground using two concentric shells. Stage 1 models the background on an outer spherical shell using the $L_{shell}$ and $L_{planarity}$ losses, which keep the background geometry stable and mitigate radial artifacts, together with a visibility-based pruning strategy. Stage 2 adds the foreground within the inner region using the standard GS loss and boundary pruning. The approach yields cleaner background representations, reduces floaters, and enables automatic environment-map generation from the background, with quantitative gains over baselines across five outdoor datasets and perceptually superior novel view synthesis in challenging outdoor scenes. These gains come at the cost of longer training time, which does not affect runtime rendering performance, and they have practical implications for VR and mixed-reality rendering.
Abstract
Outdoor scene reconstruction remains challenging due to the stark contrast between well-textured, nearby regions and distant backgrounds dominated by low detail, uneven illumination, and sky effects. We introduce a two-stage Gaussian Splatting framework that explicitly separates and optimizes these regions, yielding higher-fidelity novel view synthesis. In stage one, background primitives are initialized within a spherical shell and optimized using a loss that combines a background-only photometric term with two geometric regularizers: one constraining Gaussians to remain inside the shell, and another aligning them with local tangential planes. In stage two, foreground Gaussians are initialized from a Structure-from-Motion reconstruction, added and refined using the standard rendering loss, while the background set remains fixed but contributes to the final image formation. Experiments on diverse outdoor datasets show that our method reduces background artifacts and improves perceptual quality compared to state-of-the-art baselines. Moreover, the explicit background separation enables automatic, object-free environment map estimation, opening new possibilities for photorealistic outdoor rendering and mixed-reality applications.
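To make the two stage-one geometric regularizers more concrete, the following is a minimal sketch of what such terms could look like. It is not the paper's formulation: the abstract does not give the exact definitions, so the hinge-squared shell penalty, the cosine-based alignment penalty, the shell being centered at the origin, and the names `shell_penalty`, `tangential_penalty`, `r_inner`, and `r_outer` are all assumptions made for illustration. In particular, "aligned with the local tangential plane" is interpreted here as the Gaussian's normal (for example, its shortest axis) pointing along the radial direction.

```python
import math


def _norm(v):
    """Euclidean length of a 3D vector given as a tuple or list."""
    return math.sqrt(sum(x * x for x in v))


def shell_penalty(centers, r_inner, r_outer):
    """Hypothetical shell term: zero for centers whose distance from the
    origin lies in [r_inner, r_outer], quadratic outside that range."""
    total = 0.0
    for c in centers:
        r = _norm(c)
        below = max(r_inner - r, 0.0)  # how far inside the inner radius
        above = max(r - r_outer, 0.0)  # how far beyond the outer radius
        total += below ** 2 + above ** 2
    return total / len(centers)


def tangential_penalty(centers, normals):
    """Hypothetical alignment term: zero when each normal is parallel (or
    anti-parallel) to the radial direction, so the Gaussian's flat extent
    lies in the sphere's tangent plane; one when the normal is tangential."""
    total = 0.0
    for c, n in zip(centers, normals):
        cos = abs(sum(a * b for a, b in zip(c, n))) / (_norm(c) * _norm(n))
        total += 1.0 - cos
    return total / len(centers)


# Toy example (assumed values): three background Gaussians, shell radii 6..10.
centers = [(0.0, 0.0, 5.0), (0.0, 0.0, 12.0), (0.0, 0.0, 8.0)]
normals = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 0.0, -1.0)]
print(shell_penalty(centers, 6.0, 10.0))
print(tangential_penalty(centers, normals))
```

In this toy example only the third Gaussian lies inside the shell, and only the second has a normal that is not radial, so each term penalizes a different subset of primitives. In an actual training loop these terms would be written with differentiable tensor operations and added, with weights, to the background-only photometric loss.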
