T-3DGS: Removing Transient Objects for 3D Scene Reconstruction
Alexander Markin, Vadim Pryadilshchikov, Artem Komarichev, Ruslan Rakhimov, Peter Wonka, Evgeny Burnaev
TL;DR
T-3DGS removes transient objects from video so that scenes can be reconstructed at high fidelity with 3D Gaussian Splatting. It introduces an unsupervised Reconstruction Uncertainty Predictor (RUP) that uses semantic features from DINOv2 and a bivariate residual model to identify transient regions, complemented by a KL-divergence-based regularization for robust mask generation. A Transient Mask Refinement (TMR) pipeline, built on segmentation with SAM, propagates and refines masks across frames to handle semi-transient objects, while depth-aware regularization reduces artifacts near boundaries. Together, these components yield artifact-free reconstructions with improved temporal coherence and boundary accuracy, outperforming prior methods on both sparsely and densely captured datasets. The work also provides a dataset and evaluation framework for semi-transient scenarios, targeting robust 3D scene reconstruction in uncontrolled real-world settings.
Abstract
Transient objects in video sequences can significantly degrade the quality of 3D scene reconstructions. To address this challenge, we propose T-3DGS, a novel framework that robustly filters out transient distractors during 3D reconstruction using Gaussian Splatting. Our framework consists of two steps. First, we employ an unsupervised classification network that distinguishes transient objects from static scene elements by leveraging their distinct training dynamics within the reconstruction process. Second, we refine these initial detections by integrating an off-the-shelf segmentation method with a bidirectional tracking module, which together enhance boundary accuracy and temporal coherence. Evaluations on both sparsely and densely captured video datasets demonstrate that T-3DGS significantly outperforms state-of-the-art approaches, enabling high-fidelity 3D reconstructions in challenging, real-world scenarios.
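To make the role of a transient mask concrete, the following is a minimal sketch of how a per-pixel transient probability could gate a photometric loss so that distractor pixels do not drive the reconstruction. It is an illustration under stated assumptions, not the T-3DGS implementation: the function names, the hard threshold, the prior value, and the specific Bernoulli KL term are all hypothetical, since this summary does not specify the paper's actual loss or its KL-divergence formulation. Images are treated as flattened lists of grayscale values for brevity.

```python
import math


def bernoulli_kl(p, q, eps=1e-8):
    """KL(Bernoulli(p) || Bernoulli(q)) for scalar probabilities p and q."""
    # Clamp to avoid log(0); eps is an arbitrary small constant.
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def masked_l1_loss(rendered, target, transient_prob, threshold=0.5):
    """Mean absolute error over pixels considered static.

    A pixel counts as static when its predicted transient probability is
    below `threshold` (a hypothetical hard cutoff chosen for this sketch).
    """
    errors = [
        abs(r - t)
        for r, t, p in zip(rendered, target, transient_prob)
        if p < threshold
    ]
    # If every pixel is masked out, there is nothing to supervise.
    return sum(errors) / len(errors) if errors else 0.0


def total_loss(rendered, target, transient_prob, prior=0.1, kl_weight=0.01):
    """Masked photometric loss plus a KL penalty on the mask's mean.

    ASSUMPTION: the KL term pulls the average transient probability toward
    a fixed prior so the mask cannot trivially mark everything as transient.
    This is a stand-in, not the regularizer used in the paper.
    """
    mean_prob = sum(transient_prob) / len(transient_prob)
    photometric = masked_l1_loss(rendered, target, transient_prob)
    return photometric + kl_weight * bernoulli_kl(mean_prob, prior)


if __name__ == "__main__":
    rendered = [0.2, 0.5, 0.9]
    target = [0.2, 0.4, 0.1]
    transient_prob = [0.1, 0.2, 0.9]  # third pixel is likely a distractor
    print(masked_l1_loss(rendered, target, transient_prob))
    print(total_loss(rendered, target, transient_prob))
```

In this toy input the third pixel has a large residual but is excluded from the photometric term because its transient probability exceeds the threshold; only the two static pixels contribute. A penalty of some kind on the mask is needed in any such scheme, because otherwise the loss can be minimized by masking out every pixel.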
