Gaussian in the Wild: 3D Gaussian Splatting for Unconstrained Image Collections
Dongbin Zhang, Chuming Wang, Weitao Wang, Peihao Li, Minghan Qin, Haoqian Wang
TL;DR
GS-W is a 3D Gaussian splatting framework that separates intrinsic appearance from dynamic appearance (lighting, weather) for each Gaussian point, enabling robust novel view synthesis from unconstrained image collections. It adds adaptive sampling across multiple 2D feature maps so that each point can pick up local detail, and a 2D visibility map to reduce the impact of transient occluders, while maintaining real-time rendering via a tile-based rasterizer. The method outperforms NeRF-based baselines in both rendering quality (PSNR/SSIM/LPIPS) and speed (≈200 FPS, over 1000× faster than some NeRF methods), and ablations confirm the importance of per-point dynamic appearance, adaptive sampling, and transient handling. By combining an explicit 3D Gaussian representation with disentangled appearance modeling and an efficient rendering pipeline, the approach offers flexible appearance tuning and strong generalization across diverse scenes.
Abstract
Novel view synthesis from unconstrained in-the-wild images remains a meaningful but challenging task. The photometric variation and transient occluders in such images make it difficult to reconstruct the original scene accurately. Previous approaches tackle the problem by introducing a global appearance feature in Neural Radiance Fields (NeRF). However, in the real world, the appearance of each point in a scene is determined by its own intrinsic material attributes and by the varying environmental impacts it receives. Inspired by this fact, we propose Gaussian in the Wild (GS-W), a method that uses 3D Gaussian points to reconstruct the scene and introduces separate intrinsic and dynamic appearance features for each point, capturing the unchanged scene appearance along with dynamic variations such as illumination and weather. Additionally, an adaptive sampling strategy is presented to allow each Gaussian point to focus on local, detailed information more effectively. We also reduce the impact of transient occluders using a 2D visibility map. Extensive experiments demonstrate that GS-W achieves better reconstruction quality and finer detail than NeRF-based methods, with faster rendering speed. Video results and code are available at https://eastbeanzhang.github.io/GS-W/.
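To make the role of the 2D visibility map concrete, the following is a minimal sketch of one way such a map could down-weight transient occluders in a photometric loss. It is an illustration under stated assumptions, not the GS-W formulation: the function name visibility_weighted_l1, the choice of an L1 error, and the normalisation by total weight are all hypothetical and do not come from the abstract.

```python
import numpy as np

def visibility_weighted_l1(rendered, target, visibility):
    """Mean absolute colour error with each pixel weighted by its visibility.

    Hypothetical helper for illustration only (not taken from the paper).
    rendered, target: (H, W, 3) float arrays in [0, 1].
    visibility: (H, W) float array in [0, 1]; values near 0 mark pixels
    assumed to be covered by transient occluders.
    """
    weights = visibility[..., None]  # broadcast over the colour channels
    weighted_error = weights * np.abs(rendered - target)
    # Normalise by the total weight so that masking pixels does not
    # artificially shrink the average error of the remaining pixels.
    total_weight = weights.sum() * rendered.shape[-1]
    return weighted_error.sum() / (total_weight + 1e-8)

# Toy usage: one "occluded" pixel differs between render and photo.
rendered = np.zeros((2, 2, 3))
target = np.zeros((2, 2, 3))
target[0, 0] = 1.0                      # e.g. a passer-by in the photo

fully_visible = np.ones((2, 2))
occluder_masked = fully_visible.copy()
occluder_masked[0, 0] = 0.0             # visibility map flags that pixel

loss_unmasked = visibility_weighted_l1(rendered, target, fully_visible)
loss_masked = visibility_weighted_l1(rendered, target, occluder_masked)
```

In this toy case the flagged pixel no longer contributes to the loss, so the occluder does not pull the reconstruction toward it. A learned visibility map would typically also need some regularisation to stop it from collapsing to zero everywhere; how GS-W handles that is not stated in this section.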
