GARAD-SLAM: 3D GAussian splatting for Real-time Anti Dynamic SLAM
Mingrui Li, Weijian Chen, Na Cheng, Jingyuan Xu, Dong Li, Hongyu Wang
TL;DR
GARAD-SLAM addresses dynamic interference in 3D Gaussian Splatting SLAM by combining back-end CRF-based segmentation over Gaussians with a Gaussian pyramid network that maps labels to front-end features, and by enforcing a dynamic rendering penalty during optimization. It directly labels Gaussians as static or dynamic by minimizing the Gibbs energy $E(X)=\sum_i \psi_u(x_i)+\sum_{i<j} \psi_p(x_i,x_j)$ with unary potentials derived from a Gaussian Mixture Model, and it uses a rendering loss $\mathcal{L}=\lambda_{p-ssim}\mathcal{L}_{p-ssim}+\lambda_{dyn}\mathcal{L}_{dyn}$ where $\mathcal{L}_{dyn}=\sum_{i\in \mathcal{G}_D}{\alpha_i}^2$. The approach tightly couples tracking and mapping, reduces dynamic artifacts, and yields state-of-the-art rendering quality while maintaining competitive tracking on real-world datasets. It runs in real time on commodity GPUs, enabling robust dense reconstruction in dynamic environments for AR/VR and robotics.
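The two formulas above can be made concrete with a small NumPy sketch. It is only an illustration under stated assumptions, not the authors' implementation: the unary cost is taken as the negative log of a per-Gaussian dynamic probability (standing in for the GMM output), the pairwise term is assumed to be a Potts penalty weighted by a Gaussian kernel on 3D distance, and the names `gibbs_energy`, `dynamic_penalty`, `total_loss`, `w`, `theta`, and the default weights are all hypothetical.

```python
import numpy as np


def gibbs_energy(labels, unary, positions, w=1.0, theta=1.0):
    """E(X) = sum_i psi_u(x_i) + sum_{i<j} psi_p(x_i, x_j).

    labels:    (N,) int array, 0 = static, 1 = dynamic
    unary:     (N, 2) cost of assigning each label to each Gaussian
    positions: (N, 3) Gaussian centers
    """
    n = len(labels)
    # Unary term: cost of the label actually chosen for each Gaussian.
    e_unary = unary[np.arange(n), labels].sum()
    # Pairwise term (assumption): Potts penalty for disagreeing labels,
    # weighted by a Gaussian kernel so nearby Gaussians prefer equal labels.
    diff = positions[:, None, :] - positions[None, :, :]
    kernel = w * np.exp(-(diff ** 2).sum(axis=-1) / (2.0 * theta ** 2))
    disagree = labels[:, None] != labels[None, :]
    upper = np.triu_indices(n, k=1)  # count each pair i < j once
    e_pair = (kernel * disagree)[upper].sum()
    return float(e_unary + e_pair)


def dynamic_penalty(alpha, is_dynamic):
    """L_dyn = sum of squared opacities over dynamically labeled Gaussians."""
    return float(np.sum(alpha[is_dynamic] ** 2))


def total_loss(l_pssim, alpha, is_dynamic, lam_pssim=1.0, lam_dyn=0.1):
    """L = lambda_pssim * L_pssim + lambda_dyn * L_dyn (weights are placeholders)."""
    return lam_pssim * l_pssim + lam_dyn * dynamic_penalty(alpha, is_dynamic)


# Toy scene: two nearby Gaussians that look dynamic, one distant static one.
positions = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [5.0, 0.0, 0.0]])
p_dynamic = np.array([0.9, 0.8, 0.1])  # stand-in for GMM-derived probabilities
unary = -np.log(np.stack([1.0 - p_dynamic, p_dynamic], axis=1))

consistent = np.array([1, 1, 0])
inconsistent = np.array([1, 0, 0])
print(gibbs_energy(consistent, unary, positions))
print(gibbs_energy(inconsistent, unary, positions))

alpha = np.array([0.5, 0.2, 0.9])
print(total_loss(0.3, alpha, consistent.astype(bool)))
```

In this toy setup the labeling that agrees with the probabilities and keeps neighboring Gaussians consistent has the lower energy. The opacity penalty pushes dynamically labeled Gaussians toward transparency rather than deleting them, so a Gaussian whose label later flips back to static can still recover.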
Abstract
3D Gaussian Splatting (3DGS)-based SLAM systems have garnered widespread attention for their real-time, high-fidelity rendering. However, in real-world environments with dynamic objects, existing 3DGS-based SLAM systems often suffer from mapping errors and tracking drift. To address these problems, we propose GARAD-SLAM, a real-time 3DGS-based SLAM system tailored for dynamic scenes. For tracking, unlike traditional methods, we perform dynamic segmentation directly on the Gaussians and map the result back to the front-end through a Gaussian pyramid network to obtain dynamic point labels, achieving precise dynamic removal and robust tracking. For mapping, we impose rendering penalties on dynamically labeled Gaussians, which are updated through the network, to avoid the irreversible erroneous removal caused by simple pruning. Results on real-world datasets demonstrate that our method is competitive with baseline methods in tracking while producing fewer artifacts and higher-quality reconstructions in rendering.
