NGM-SLAM: Gaussian Splatting SLAM with Radiance Field Submap

Jingwei Huang, Mingrui Li, Lei Sun, Aaron Xuxiang Tian, Tianchen Deng, Hongyu Wang

TL;DR

NGM-SLAM addresses the challenge of large-scale, high-fidelity dense mapping with real-time loop closure by integrating neural radiance field submaps as priors into a 3D Gaussian Splatting SLAM framework. The system uses neural submaps to supervise Gaussian rendering, enabling gap filling and texture-rich reconstruction while maintaining real-time performance through multi-scale Gaussian rendering and pruning. A local-to-global loop closure strategy combines submap-level bundle adjustment (BA) with a coarse-to-fine global adjustment to correct drift, achieving scalable, accurate tracking and mapping across monocular, stereo, and RGB-D inputs. The approach demonstrates state-of-the-art performance on the Replica, ScanNet, TUM RGB-D, and EuRoC datasets, with robust hole filling, reduced aliasing, and online loop corrections suitable for large-scale scenes and potential mobile deployment.

Abstract

SLAM systems based on Gaussian Splatting have garnered attention due to their capabilities for rapid real-time rendering and high-fidelity mapping. However, current Gaussian Splatting SLAM systems usually struggle with large scene representation and lack effective loop closure detection. To address these issues, we introduce NGM-SLAM, the first 3DGS-based SLAM system that utilizes neural radiance field submaps for progressive scene expression, effectively integrating the strengths of neural radiance fields and 3D Gaussian Splatting. We utilize neural radiance field submaps as supervision and achieve high-quality scene expression and online loop closure adjustments through Gaussian rendering of fused submaps. Our results on multiple real-world scenes and large-scale scene datasets demonstrate that our method can achieve accurate hole filling and high-quality scene expression, supporting monocular, stereo, and RGB-D inputs, and achieving state-of-the-art scene reconstruction and tracking performance.
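
To make the idea of "neural submaps as supervision" concrete, here is a minimal sketch of one way a Gaussian render could be supervised by both the observed keyframe and a neural submap render. The abstract does not give the loss, so everything here is an assumption for illustration: the function name, the L1 form, the `valid_mask` convention (true where the sensor observed the pixel), and the `prior_weight` value are all hypothetical and are not taken from the paper.

```python
import numpy as np

def supervised_photometric_loss(gs_render, observed, prior_render,
                                valid_mask, prior_weight=0.5):
    """Toy L1 loss (hypothetical, not the paper's formulation).

    Observed pixels supervise the Gaussian render where valid_mask is True;
    the neural submap render supervises it elsewhere (gap filling).
    """
    gs_render = np.asarray(gs_render, dtype=np.float64)
    observed = np.asarray(observed, dtype=np.float64)
    prior_render = np.asarray(prior_render, dtype=np.float64)
    valid = np.asarray(valid_mask, dtype=bool)

    # Per-pixel absolute errors, split by who supervises each pixel.
    obs_err = np.abs(gs_render - observed)[valid]
    prior_err = np.abs(gs_render - prior_render)[~valid]

    # Guard against empty selections so the mean is always defined.
    obs_term = obs_err.mean() if obs_err.size else 0.0
    prior_term = prior_err.mean() if prior_err.size else 0.0
    return obs_term + prior_weight * prior_term
```

In this sketch, a pixel with no valid observation still contributes a gradient signal through the prior term, which is the intuition behind hole filling; the relative weighting is a free design choice here.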

Paper Structure

This paper contains 8 sections, 9 equations, 4 figures, and 5 tables.

Figures (4)

  • Figure 1: The system includes two modules: tracking and mapping. After the initial submap starts to be established, the tracking module continuously estimates the camera pose and detects loops, while passing keyframes of the submap to the mapping module. The mapping module first constructs a neural submap that also serves as a prior for the multi-scale Gaussian Splatting (GS) submap, and performs parallel rendering between submaps. Local Bundle Adjustment (BA) is conducted within submaps to correct pose and mapping errors, and Global BA is executed on all anchor frames when a loop closure is detected. Finally, the resulting GS maps are stitched together.
  • Figure 2: We present scene and local detail results on four sequences in the Replica dataset (Straub et al., 2019), including monocular and RGB-D reconstruction. Our method exhibits superior detail expression and overall reconstruction, while preserving the finest texture details.
  • Figure 3: The reconstruction results on four large-scale apartment sequences, each consisting of multiple rooms, in the Replica dataset (Straub et al., 2019) demonstrate that our method achieves more accurate reconstruction compared to NeRF-based approaches and the state-of-the-art MonoGS (Matsuki et al., CVPR 2024). It avoids catastrophic forgetting. Moreover, as demonstrated in the final sequence showcasing window details, we can achieve reasonable background completion and scene generalization.
  • Figure 4: On large-scale multi-room sequences in the ScanNet dataset, our method demonstrates superior error accumulation correction capability compared to current 3DGS-based approaches. We can accurately ensure consistency across multiple views, avoiding erroneous scene reconstructions such as blurry shoes and bicycles, while also preventing local detail collapse caused by aliasing artifacts.
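
The control flow described in the Figure 1 caption (keyframes grouped into submaps, local BA within a submap, global BA over anchor frames on loop closure) can be sketched as simple bookkeeping. This is a toy under stated assumptions, not the authors' implementation: the class and method names are hypothetical, the BA steps are only logged rather than solved, and splitting submaps by a fixed keyframe count (`max_keyframes`) is an assumption, since the caption does not say how submaps are delimited.

```python
from dataclasses import dataclass, field

@dataclass
class Submap:
    anchor_frame: int                              # first keyframe id of the submap
    keyframes: list = field(default_factory=list)  # keyframe ids in this submap

class SubmapScheduler:
    """Toy bookkeeping for the Figure 1 flow; BA steps are logged, not solved."""

    def __init__(self, max_keyframes=3):
        self.max_keyframes = max_keyframes  # hypothetical split criterion
        self.submaps = []
        self.log = []

    def add_keyframe(self, frame_id, loop_detected=False):
        # Open a new submap when none exists or the current one is full.
        if not self.submaps or len(self.submaps[-1].keyframes) >= self.max_keyframes:
            self.submaps.append(Submap(anchor_frame=frame_id))
        current = self.submaps[-1]
        current.keyframes.append(frame_id)

        # Local BA operates only on the keyframes of the active submap.
        self.log.append(("local_ba", tuple(current.keyframes)))

        # On loop closure, global BA operates on every submap's anchor frame.
        if loop_detected:
            anchors = tuple(s.anchor_frame for s in self.submaps)
            self.log.append(("global_ba", anchors))
```

For example, feeding five keyframes with `max_keyframes=2` and a loop detected on the last one yields three submaps and a single global BA entry over their anchor frames.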