SING3R-SLAM: Submap-based Indoor Monocular Gaussian SLAM with 3D Reconstruction Priors

Kunyi Li; Michael Niemeyer; Sen Wang; Stefano Gasperini; Nassir Navab; Federico Tombari

TL;DR

SING3R-SLAM tackles drift and inefficiency in dense monocular SLAM by coupling locally accurate submap reconstructions with a globally consistent Gaussian map. The system uses Sub-Track3R to build and align submaps, and a Gaussian Mapper to jointly refine poses and geometry through multi-view rendering and optimization, with a bidirectional loop-closure mechanism feeding back into tracking. Key contributions include inter- and intra-submap registration, a differentiable global Gaussian map, and a robust backend that enforces global consistency while remaining memory-efficient. Experiments on 7-Scenes and ScanNet-v2 show state-of-the-art tracking, high-fidelity geometry, and superior novel-view rendering with a compact map of about 7 MB, which makes the approach practical for long indoor sequences and for downstream tasks such as novel view synthesis (NVS).

Abstract

Recent advances in dense 3D reconstruction enable the accurate capture of local geometry; however, integrating them into SLAM is challenging due to drift and redundant point maps, which limit efficiency and downstream tasks, such as novel view synthesis. To address these issues, we propose SING3R-SLAM, a globally consistent and compact Gaussian-based dense RGB SLAM framework. The key idea is to combine locally consistent 3D reconstructions with a unified global Gaussian representation that jointly refines scene geometry and camera poses, enabling efficient and versatile 3D mapping for multiple downstream applications. SING3R-SLAM first builds locally consistent submaps through our lightweight tracking and reconstruction module, and then progressively aligns and fuses them into a global Gaussian map that enforces cross-view geometric consistency. This global map, in turn, provides feedback to correct local drift and enhance the robustness of tracking. Extensive experiments demonstrate that SING3R-SLAM achieves state-of-the-art tracking, 3D reconstruction, and novel view rendering, resulting in over 12% improvement in tracking and producing finer, more detailed geometry, all while maintaining a compact and memory-efficient global representation on real-world datasets.
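To make the idea of aligning submaps concrete, the following is a minimal sketch of one standard way to register two overlapping point sets from a monocular system: a closed-form similarity (scale, rotation, translation) fit in the style of Umeyama. This is an assumption chosen for illustration, not the registration actually used by Sub-Track3R, which the abstract does not specify. The function name `align_sim3` and the toy data are hypothetical.

```python
import numpy as np


def align_sim3(src, dst):
    """Closed-form similarity fit (Umeyama-style): dst ~ s * R @ src + t.

    src, dst: (N, 3) arrays of corresponding 3D points, e.g. points seen
    in the overlap of two submaps (illustrative assumption).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = len(src)

    # Center both point sets.
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # 3x3 cross-covariance between target and source.
    cov = dst_c.T @ src_c / n
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation.
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0

    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / n
    s = float(np.trace(np.diag(D) @ S) / var_src)
    t = mu_dst - s * (R @ mu_src)
    return s, R, t


# Toy usage: recover a known transform between two synthetic "submaps".
rng = np.random.default_rng(0)
points_a = rng.normal(size=(50, 3))
angle = np.deg2rad(30.0)
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
points_b = 2.5 * points_a @ R_true.T + np.array([1.0, -2.0, 0.5])
s_est, R_est, t_est = align_sim3(points_a, points_b)
```

A similarity transform rather than a rigid one is used here because monocular reconstructions are only defined up to scale, so two submaps may disagree in scale as well as pose. A full system would additionally refine such an initial alignment, for example through the joint pose and geometry optimization that the abstract attributes to the global Gaussian map.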
