WildGS-SLAM: Monocular Gaussian Splatting SLAM in Dynamic Environments
Jianhao Zheng, Zihan Zhu, Valentin Bieri, Marc Pollefeys, Songyou Peng, Iro Armeni
TL;DR
WildGS-SLAM presents a monocular SLAM framework that robustly operates in dynamic environments by representing the static scene with a 3D Gaussian map and predicting per-pixel uncertainty via an online MLP driven by DINOv2 features. Uncertainty informs both tracking (uncertainty-weighted dense bundle adjustment) and mapping (uncertainty-aware render loss), enabling dynamic object removal without depth or semantic priors. The approach yields artifact-free novel view synthesis and state-of-the-art performance on dynamic benchmarks, including newly collected Wild-SLAM MoCap and iPhone datasets. The work contributes a practical, generalizable dynamic-SLAM solution and a comprehensive Wild-SLAM dataset for broader evaluation in unconstrained real-world scenarios.
Abstract
We present WildGS-SLAM, a robust and efficient monocular RGB SLAM system designed to handle dynamic environments by leveraging uncertainty-aware geometric mapping. Unlike traditional SLAM systems, which assume static scenes, our approach integrates depth and uncertainty information to enhance tracking, mapping, and rendering performance in the presence of moving objects. We introduce an uncertainty map, predicted by a shallow multi-layer perceptron and DINOv2 features, to guide dynamic object removal during both tracking and mapping. This uncertainty map enhances dense bundle adjustment and Gaussian map optimization, improving reconstruction accuracy. Our system is evaluated on multiple datasets and demonstrates artifact-free view synthesis. Results showcase WildGS-SLAM's superior performance in dynamic environments compared to state-of-the-art methods.
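To make the role of the uncertainty map more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes per-pixel features (standing in for DINOv2 features) are already available as an array, uses a hypothetical two-layer MLP with a softplus output to keep the uncertainty positive, and assumes an inverse-variance weighting of an L1 photometric residual as the uncertainty-aware render loss. The function names, layer sizes, and the exact form of the loss are all assumptions made for illustration.

```python
import numpy as np


def softplus(x):
    # Numerically stable log(1 + exp(x)); output is always non-negative.
    return np.logaddexp(0.0, x)


def predict_uncertainty(features, w1, b1, w2, b2, eps=1e-3):
    """Map per-pixel features (H, W, C) to a positive uncertainty map (H, W).

    Hypothetical shallow MLP: one ReLU hidden layer, one scalar output.
    """
    hidden = np.maximum(features @ w1 + b1, 0.0)  # (H, W, hidden_dim)
    raw = hidden @ w2 + b2                        # (H, W, 1)
    return softplus(raw[..., 0]) + eps            # strictly positive


def uncertainty_weighted_l1(rendered, observed, uncertainty):
    """Weighted mean of the per-pixel L1 color residual.

    Assumed weighting: 1 / uncertainty^2, so pixels flagged as uncertain
    (likely dynamic) contribute little to the loss.
    """
    residual = np.abs(rendered - observed).mean(axis=-1)  # (H, W)
    weights = 1.0 / uncertainty**2
    return float((weights * residual).sum() / weights.sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(4, 5, 8))            # stand-in for image features
    w1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
    w2, b2 = rng.normal(size=(16, 1)), np.zeros(1)
    beta = predict_uncertainty(feats, w1, b1, w2, b2)

    rendered = rng.uniform(size=(4, 5, 3))
    observed = rng.uniform(size=(4, 5, 3))
    print(uncertainty_weighted_l1(rendered, observed, beta))
```

In this sketch, a pixel covered by a moving object would ideally receive a large uncertainty, which shrinks its weight and keeps it from corrupting the static map. A practical system would also need a term that stops the predicted uncertainty from growing without bound; that part is omitted here.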
