RoDyn-SLAM: Robust Dynamic Dense RGB-D SLAM with Neural Radiance Fields
Haochen Jiang, Yueming Xu, Kejie Li, Jianfeng Feng, Li Zhang
TL;DR
RoDyn-SLAM addresses dense RGB-D SLAM in dynamic environments by combining a neural implicit static map with a robust motion-mask generation pipeline and a divide-and-conquer pose optimization strategy. It fuses optical-flow and semantic priors to filter out dynamic regions and introduces an edge warp loss to enforce geometric consistency between adjacent frames, while camera poses and the map are optimized through differentiable rendering. Key contributions include the multi-resolution hash-encoded implicit map, the motion mask fusion scheme, and the edge-based tracking that improves robustness in dynamic scenes, achieving state-of-the-art results among recent neural RGB-D methods on two dynamic datasets. The approach enables accurate pose estimation and high-fidelity reconstruction of the static scene, with practical implications for robotics and AR/VR in dynamic indoor environments.
Abstract
Leveraging neural implicit representations to conduct dense RGB-D SLAM has been studied in recent years. However, this approach relies on a static environment assumption and does not work robustly in dynamic environments because the observed geometry and photometry are inconsistent across frames. To address the challenges presented by dynamic environments, we propose a novel dynamic SLAM framework based on neural radiance fields. Specifically, we introduce a motion mask generation method to filter out invalid sampled rays. This design fuses the optical flow mask and the semantic mask to enhance the precision of the motion mask. To further improve the accuracy of pose estimation, we design a divide-and-conquer pose optimization algorithm that distinguishes between keyframes and non-keyframes. The proposed edge warp loss effectively strengthens the geometric constraints between adjacent frames. Extensive experiments are conducted on two challenging datasets, and the results show that RoDyn-SLAM achieves state-of-the-art performance among recent neural RGB-D methods in both accuracy and robustness.
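
To make the ray-filtering idea concrete, the following is a minimal sketch of how an optical flow mask and a semantic mask might be fused and then used to discard sampled rays that land on dynamic pixels. The abstract does not specify the fusion rule, so this sketch assumes a simple per-pixel union (logical OR) of the two masks. The function names `fuse_motion_masks` and `filter_sampled_rays` and the (row, col) pixel layout are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
import numpy as np


def fuse_motion_masks(flow_mask, semantic_mask):
    """Return a boolean mask that is True where a pixel is treated as dynamic.

    Assumption (not from the paper): a pixel is dynamic if either the
    optical flow mask or the semantic mask flags it, i.e. a per-pixel union.
    """
    flow_mask = np.asarray(flow_mask, dtype=bool)
    semantic_mask = np.asarray(semantic_mask, dtype=bool)
    if flow_mask.shape != semantic_mask.shape:
        raise ValueError("flow_mask and semantic_mask must have the same shape")
    return flow_mask | semantic_mask


def filter_sampled_rays(pixels, motion_mask):
    """Keep only the sampled pixels (ray origins in image space) that are static.

    pixels: integer array of shape (N, 2) holding (row, col) coordinates.
    motion_mask: boolean array of shape (H, W), True where dynamic.
    """
    pixels = np.asarray(pixels, dtype=int)
    # Look up the mask value under each sampled pixel.
    is_dynamic = motion_mask[pixels[:, 0], pixels[:, 1]]
    return pixels[~is_dynamic]


# Toy usage on a 4x4 image: one pixel flagged by each mask.
flow = np.zeros((4, 4), dtype=bool)
flow[0, 0] = True
semantic = np.zeros((4, 4), dtype=bool)
semantic[1, 1] = True

motion = fuse_motion_masks(flow, semantic)
sampled = np.array([[0, 0], [1, 1], [2, 2], [3, 3]])
static_rays = filter_sampled_rays(sampled, motion)
```

In this toy case the rays at (0, 0) and (1, 1) are dropped and those at (2, 2) and (3, 3) are kept. A union is the conservative choice: it removes more rays than either mask alone, trading some static observations for fewer dynamic outliers in tracking and mapping.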
