SpotlessSplats: Ignoring Distractors in 3D Gaussian Splatting

Sara Sabour; Lily Goli; George Kopanas; Mark Matthews; Dmitry Lagun; Leonidas Guibas; Alec Jacobson; David J. Fleet; Andrea Tagliasacchi

TL;DR

SpotLessSplats tackles robust 3D reconstruction with 3D Gaussian Splatting (3DGS) in the presence of transient distractors, using semantic features from text-to-image diffusion models to detect outliers without supervision. It introduces two masking pipelines, SLS-agg and SLS-mlp, that operate in semantic feature space to separate inliers from outliers and feed a robust training objective for the 3DGS representation, complemented by appearance modeling. Utilization-based pruning (UBP), driven by gradients, further compresses the scene by reducing the number of splats while preserving fidelity. Across challenging casual-capture benchmarks, SpotLessSplats achieves state-of-the-art robust reconstructions and notable compute savings. Thorough ablations validate the design choices and reveal limitations tied to the dependence on diffusion features and to semantic similarity among distractors.
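The cluster-level masking idea behind a pipeline like SLS-agg can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the quantile-based per-pixel inlier test, and the 0.5 voting threshold are all assumptions made for illustration, and the cluster labels are assumed to come from some prior clustering of per-pixel semantic features. The sketch shows how per-pixel residual decisions can be aggregated so that every pixel in a semantic cluster receives the same inlier/outlier label, and how the resulting mask can gate a photometric loss.

```python
import numpy as np


def cluster_aggregated_inlier_mask(residuals, cluster_ids,
                                   inlier_quantile=0.5, vote_threshold=0.5):
    """Label whole clusters as inliers or outliers (hypothetical helper).

    residuals:   (H, W) float array of per-pixel photometric errors.
    cluster_ids: (H, W) non-negative int array; pixels sharing an id are
                 assumed to share semantic features.
    Returns an (H, W) boolean mask, True for inlier pixels.
    """
    # Per-pixel guess: residuals at or below the chosen quantile are inliers.
    # (Assumption: a simple stand-in for a more careful robust estimator.)
    cutoff = np.quantile(residuals, inlier_quantile)
    pixel_inlier = (residuals <= cutoff).astype(np.float64)

    flat_ids = cluster_ids.ravel()
    num_clusters = int(flat_ids.max()) + 1

    # Fraction of pixels in each cluster that were guessed to be inliers.
    votes = np.bincount(flat_ids, weights=pixel_inlier.ravel(),
                        minlength=num_clusters)
    counts = np.bincount(flat_ids, minlength=num_clusters)
    inlier_fraction = votes / np.maximum(counts, 1)

    # A cluster is kept only if enough of its pixels look like inliers.
    cluster_is_inlier = inlier_fraction >= vote_threshold
    return cluster_is_inlier[cluster_ids]


def masked_l1_loss(rendered, target, inlier_mask):
    """Mean L1 error over inlier pixels only (hypothetical helper).

    rendered, target: (H, W, C) float arrays; inlier_mask: (H, W) bool.
    """
    per_pixel = np.abs(rendered - target).mean(axis=-1)
    kept = inlier_mask.sum()
    return float((per_pixel * inlier_mask).sum() / max(kept, 1))
```

Aggregating over clusters means that a distractor pixel whose color happens to match the background is still masked when most of its cluster has high residuals, which is the failure case of purely per-pixel masks that the Figure 1 caption describes.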

Abstract

3D Gaussian Splatting (3DGS) is a promising technique for 3D reconstruction, offering efficient training and rendering speeds that make it suitable for real-time applications. However, current methods require highly controlled environments (no moving people or wind-blown elements, and consistent lighting) to meet the inter-view consistency assumption of 3DGS. This makes reconstruction of real-world captures problematic. We present SpotLessSplats, an approach that leverages pre-trained, general-purpose features coupled with robust optimization to effectively ignore transient distractors. Our method achieves state-of-the-art reconstruction quality, both visually and quantitatively, on casual captures. Additional results are available at: https://spotlesssplats.github.io

Paper Structure (30 sections, 12 equations, 11 figures, 1 table)


Introduction
Related work
Robustness in NeRF
Precomputed features
Robustness in 3DGS (concurrent works)
Background
Robust optimization of 3DGS
Method
Recognizing distractors
Spatial clustering
Spatio-temporal clustering
Adapting 3DGS to robust optimization
Warm up with scheduled sampling
Trimmed estimators in image-based training
A friendly alternative to "opacity reset"
...and 15 more sections

Figures (11)

Figure 1: Our outlier classification using clustered semantic features covers the distractor balloon fully, whereas an adapted robust mask from RobustNeRF [Sabour et al. 2023] misclassifies pixels whose color is similar to the background as inliers.
Figure 2: Our method accurately reconstructs scenes with different levels of transient occlusion, avoiding both leakage of transients and under-reconstruction, as evidenced by the quantitative and qualitative results on the NeRF On-the-go [Ren et al. 2024] dataset.
Figure 3: Lower and upper error residual labels provide weak supervision for training an MLP classifier that detects outlier distractors.
Figure 4: Quantitative and qualitative evaluation on the RobustNeRF [Sabour et al. 2023] datasets shows that SLS outperforms baseline methods built on 3DGS and NeRF by preventing over- or under-masking. $\dagger$ denotes VGG LPIPS computed on NeRF-HuGS results rather than the AlexNet LPIPS reported in NeRF-HuGS. 3DGS* denotes 3DGS with utilization-based pruning.
Figure 5: SLS reconstructs scenes from the NeRF On-the-go [Ren et al. 2024] dataset in great detail. In baselines, high-occlusion lingering distractors leak into the reconstruction and are modeled as noisy floaters. Our method is free of such artifacts.
...and 6 more figures
