Loopy-SLAM: Dense Neural SLAM with Loop Closures

Lorenzo Liso, Erik Sandström, Vladimir Yugay, Luc Van Gool, Martin R. Oswald

TL;DR

Loopy-SLAM tackles drift and map distortion in online dense RGBD SLAM by integrating loop closures into a neural point-cloud submap framework. The method grows submaps progressively, performs frame-to-model tracking, and uses global place recognition to trigger loop closures, which are corrected via a robust pose graph optimization with dense surface constraints. It avoids storing full history by deforming submaps through rigid alignments and concludes with feature fusion and refinement of the global neural map. Empirical results on Replica, TUM-RGBD, and ScanNet show state-of-the-art or competitive reconstruction, tracking, and rendering accuracy, illustrating improved global consistency and scalability compared to hash-grid-based methods. This work offers a practical, scalable dense SLAM solution that leverages loop closures without costly re-integration or history storage.

Abstract

Neural RGBD SLAM techniques have shown promise in dense Simultaneous Localization And Mapping (SLAM), yet they face challenges such as error accumulation during camera tracking, which results in distorted maps. In response, we introduce Loopy-SLAM, which globally optimizes poses and the dense 3D model. We perform frame-to-model tracking with a data-driven point-based submap generation method and trigger loop closures online by performing global place recognition. Robust pose graph optimization is used to rigidly align the local submaps. As our representation is point-based, map corrections can be performed efficiently without the need to store the entire history of input frames used for mapping, as is typically required by methods employing a grid-based mapping structure. Evaluation on the synthetic Replica and real-world TUM-RGBD and ScanNet datasets demonstrates competitive or superior performance in tracking, mapping, and rendering accuracy when compared to existing dense neural RGBD SLAM methods. Project page: notchla.github.io/Loopy-SLAM.
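
The claim that a point-based map can be corrected without re-integrating past frames can be illustrated with a minimal sketch. It assumes each submap stores 3D anchor points with attached features; the function names, the toy submap, and the correction (R, t) are hypothetical and not taken from the Loopy-SLAM code. In the real system the correction would come from pose graph optimization.

```python
import math

def rotation_z(theta):
    """3x3 rotation about the z-axis (used here only as a toy correction)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_rigid(R, t, points):
    """Apply x' = R x + t to every 3D point in the list."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]

# Hypothetical submap: anchor positions plus one feature per point.
submap_points = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.5)]
submap_features = ["f0", "f1"]  # placeholders; a rigid correction leaves them untouched

# Hypothetical correction for this submap (assumed to come from PGO).
R = rotation_z(math.pi / 2)
t = (0.5, 0.0, 0.0)

# Only the point positions move; no input frame is revisited.
corrected = apply_rigid(R, t, submap_points)
```

The point of the sketch is the cost model: the correction is one matrix-vector product per stored point, independent of how many frames were used to build the submap.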

Paper Structure

This paper contains 16 sections, 7 equations, 11 figures, and 11 tables.

Figures (11)

  • Figure 1: Benefits of Loopy-SLAM. While Point-SLAM yields high-fidelity reconstructions, it does not implement loop closure and may duplicate geometry due to drift. ESLAM faces the same problem because it also lacks loop closure. GO-SLAM implements loop closure, but produces rather low-quality map geometry. In contrast to GO-SLAM, which must store the entire history of input frames used for mapping in order to update the map after loop closures, our approach anchors the neural scene representation on points, which can simply be shifted without recomputing the dense map from scratch. We show the ATE RMSE and the depth L1 re-rendering error on the mesh for the TUM-RGBD fr1 room scene.
  • Figure 2: Loopy-SLAM Overview. Given an input RGBD stream, we first track each frame against the current active submap. If a new global keyframe is triggered by the estimated motion, we initialize a new submap; otherwise we continue mapping against the same submap. If a loop is detected between the just-completed submap and the past global keyframes, pose graph optimization (PGO) is triggered. First, we compute the loop edge constraints (1) with a coarse-to-fine dense surface registration technique, and then PGO (2) is performed with a robust dense surface registration objective. The poses and submaps are then rigidly corrected to achieve global pose and map alignment (3). Finally, the newly triggered global keyframe is added to the place recognition database.
  • Figure 4: Reconstruction Performance on Replica [straub2019replica]. Table: Our method performs better than all existing methods on average. Figure: Compared to ESLAM, which uses axis-aligned feature planes, and GO-SLAM, which uses multi-resolution hash grids, Loopy-SLAM has a significant advantage in reconstruction accuracy due to the neural point cloud of dynamic resolution. Moreover, with the pose accuracy we obtain via loop closure, we close the gap to the ground truth further. See specifically the zoomed-in visualizations. *Depth L1 for GO-SLAM shows our reproduced results from random poses (GO-SLAM evaluates on ground truth poses).
  • Figure 5: Mesh Evaluation on ScanNet [Dai2017ScanNet]. Loopy-SLAM yields drift-free, large-scale reconstructions compared to Point-SLAM (scene 54, scene 181, scene 169) and ESLAM (scene 54), and more accurate geometry compared to GO-SLAM (all scenes) and ESLAM (scene 54, scene 181). The green boxes highlight drifted or poor geometry. The red boxes show the zoomed-in view locations.
  • Figure 6: Rendering Performance on Replica [straub2019replica]. The rendering performance is comparable to Point-SLAM [sandstrom2023point], which is expected given that the same neural point cloud scene representation is used.
  • ...and 6 more figures
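
The submap logic described in the Figure 2 caption (start a new submap when the estimated motion triggers a new global keyframe, otherwise keep mapping into the active one) can be sketched as follows. This is a simplified illustration, not the authors' implementation: the trigger is reduced to a translation-only test, and the function name `assign_submaps` and the threshold `max_translation` are hypothetical.

```python
import math

def assign_submaps(positions, max_translation=0.5):
    """Return one submap index per frame.

    A new submap starts when the camera has moved more than
    `max_translation` (hypothetical threshold) from the position of the
    frame that opened the current submap. Rotation is ignored here.
    """
    submap_ids = []
    anchor = None   # position of the current submap's first frame
    current = -1    # index of the active submap
    for p in positions:
        if anchor is None or math.dist(p, anchor) > max_translation:
            anchor = p      # this frame becomes the new global keyframe
            current += 1    # open a new submap
        submap_ids.append(current)
    return submap_ids

# Toy trajectory along the x-axis (estimated camera positions).
camera_positions = [
    (0.0, 0.0, 0.0),
    (0.2, 0.0, 0.0),
    (0.6, 0.0, 0.0),
    (0.7, 0.0, 0.0),
    (1.2, 0.0, 0.0),
]
ids = assign_submaps(camera_positions)
print(ids)  # -> [0, 0, 1, 1, 2]
```

In a full pipeline, each frame that opens a submap would also be added to the place recognition database, and a detected loop between submaps would trigger pose graph optimization; both steps are omitted from this sketch.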