SpotLessSplats: Ignoring Distractors in 3D Gaussian Splatting
Sara Sabour, Lily Goli, George Kopanas, Mark Matthews, Dmitry Lagun, Leonidas Guibas, Alec Jacobson, David J. Fleet, Andrea Tagliasacchi
TL;DR
SpotLessSplats tackles robust 3D reconstruction with 3D Gaussian Splatting (3DGS) in the presence of transient distractors by using semantic features from text-to-image diffusion models to detect outliers without supervision. It introduces two masking pipelines, SLS-agg and SLS-mlp, that operate on semantic feature spaces to separate inliers from outliers and feed a robust training objective for the 3DGS representation, complemented by appearance modeling. A gradient-driven, utilization-based pruning (UBP) step further compresses the scene by reducing the number of splats while preserving fidelity. Across challenging casual-capture benchmarks, SpotLessSplats achieves state-of-the-art robust reconstructions and notable compute savings. Thorough ablations validate the design choices and reveal limitations tied to the dependence on diffusion features and to semantic similarity among distractors.
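The idea of feeding an inlier/outlier mask into a robust training objective can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that semantic features have already been clustered into per-pixel integer labels (`cluster_ids`), and the function names, the residual threshold, and the majority-vote rule are all hypothetical choices made for illustration.

```python
import numpy as np


def cluster_inlier_mask(residuals, cluster_ids,
                        residual_threshold=0.5, vote_threshold=0.5):
    """Return a boolean (H, W) mask that is True for inlier pixels.

    residuals:   (H, W) per-pixel photometric error.
    cluster_ids: (H, W) integer labels, assumed to come from clustering
                 semantic features (hypothetical input).
    A whole cluster is marked as an outlier when more than
    `vote_threshold` of its pixels have a residual above
    `residual_threshold` (both values are illustrative).
    """
    pixel_outlier = residuals > residual_threshold
    mask = np.ones(residuals.shape, dtype=bool)
    for cid in np.unique(cluster_ids):
        region = cluster_ids == cid
        if pixel_outlier[region].mean() > vote_threshold:
            mask[region] = False
    return mask


def masked_l1_loss(rendered, target, inlier_mask):
    """Mean L1 error over inlier pixels only.

    rendered, target: (H, W, C) images; inlier_mask: (H, W) boolean.
    """
    per_pixel = np.abs(rendered - target).mean(axis=-1)
    weights = inlier_mask.astype(per_pixel.dtype)
    # Guard against division by zero when every pixel is masked out.
    return float((per_pixel * weights).sum() / max(weights.sum(), 1.0))
```

In this sketch, pixels belonging to a cluster flagged as a distractor contribute nothing to the loss, so they produce no gradient for the scene representation. Voting per cluster rather than per pixel is one simple way to keep the mask consistent within a semantic region.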
Abstract
3D Gaussian Splatting (3DGS) is a promising technique for 3D reconstruction, offering efficient training and rendering speeds that make it suitable for real-time applications. However, current methods require highly controlled environments (no moving people or wind-blown elements, and consistent lighting) to meet the inter-view consistency assumption of 3DGS. This makes reconstruction of real-world captures problematic. We present SpotLessSplats, an approach that leverages pre-trained and general-purpose features coupled with robust optimization to effectively ignore transient distractors. Our method achieves state-of-the-art reconstruction quality, both visually and quantitatively, on casual captures. Additional results are available at: https://spotlesssplats.github.io
