Robust Gaussian Splatting SLAM by Leveraging Loop Closure

Zunjie Zhu, Youxu Fang, Xin Li, Chengang Yan, Feng Xu, Chau Yuen, Yanyan Li

TL;DR

This work tackles tracking drift and rendering quality in Gaussian Splatting SLAM when deployed with multiple rotating RGB-D cameras. It introduces a loop-closure framework that classifies Gaussians as historical or novel by timestamp, detects loops from co-visibility and from the distinct renderings of the two Gaussian types at the same viewpoint, and applies pose-graph optimization followed by bundle adjustment to obtain globally consistent camera trajectories and high-fidelity 3D Gaussian maps. The approach reports state-of-the-art pose estimation and novel-view rendering on synthetic and real-world datasets, validated against GS-based baselines, with open-sourcing planned. The findings suggest that loop-closure-aware GS-SLAM can improve robustness and rendering realism in multi-sensor, rotating-camera SLAM scenarios, enabling more reliable 3D scene reconstruction for robotics and AR/VR applications.

Abstract

3D Gaussian Splatting algorithms excel at novel view rendering and have been adapted to extend the capabilities of traditional SLAM systems. However, current Gaussian Splatting SLAM methods, designed mainly for hand-held RGB or RGB-D sensors, suffer from tracking drift when used with rotating RGB-D camera setups. In this paper, we propose a robust Gaussian Splatting SLAM architecture that takes input from multiple rotating RGB-D cameras to achieve accurate localization and photorealistic rendering. A dedicated Gaussian Splatting Loop Closure module addresses the accumulated tracking and mapping errors found in conventional Gaussian Splatting SLAM systems. First, each Gaussian is associated with an anchor frame and categorized as historical or novel based on its timestamp. By rendering the two types of Gaussians at the same viewpoint, the proposed loop detection strategy considers both co-visibility relationships and the distinct rendering outcomes. Furthermore, a loop closure optimization approach removes camera pose drift while maintaining the quality of the 3D Gaussian model: a lightweight pose graph optimization algorithm corrects the pose drift, and the Gaussians are then updated based on the optimized poses. Additionally, a bundle adjustment scheme further refines camera poses using photometric and geometric constraints, enhancing the global consistency of the reconstructed scene. Quantitative and qualitative evaluations on both synthetic and real-world datasets demonstrate that our method outperforms state-of-the-art methods in camera pose estimation and novel view rendering. The code will be open-sourced for the community.
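
To make the anchor-frame idea in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each Gaussian stores only a world-space position and the index of its anchor frame, that "historical" versus "novel" is decided by a hypothetical frame-age threshold `window`, and that updating a Gaussian after pose correction means rigidly moving it with its anchor camera. All names (`Gaussian`, `classify`, `reanchor`) are illustrative, and a full system would also have to rotate each Gaussian's covariance, which is omitted here.

```python
from dataclasses import dataclass

@dataclass
class Gaussian:
    position: tuple      # (x, y, z) in world coordinates
    anchor_frame: int    # index of the frame that created this Gaussian

def classify(gaussians, current_frame, window):
    """Split Gaussians into (historical, novel) by anchor-frame age.

    `window` is a hypothetical threshold in frames, assumed for this sketch.
    """
    historical, novel = [], []
    for g in gaussians:
        if current_frame - g.anchor_frame > window:
            historical.append(g)
        else:
            novel.append(g)
    return historical, novel

def mat_vec(R, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))

def transpose(R):
    return [[R[j][i] for j in range(3)] for i in range(3)]

def reanchor(g, old_pose, new_pose):
    """Move a Gaussian rigidly with its anchor camera.

    Poses are (R, t) camera-to-world transforms. The point is mapped into the
    old camera frame, then back to the world with the corrected pose.
    """
    R_old, t_old = old_pose
    R_new, t_new = new_pose
    shifted = tuple(p - t for p, t in zip(g.position, t_old))
    local = mat_vec(transpose(R_old), shifted)   # inverse of a rotation is its transpose
    world = mat_vec(R_new, local)
    return Gaussian(tuple(w + t for w, t in zip(world, t_new)), g.anchor_frame)
```

For example, with the current frame at index 10 and `window=3`, Gaussians anchored at frames 0 and 5 fall into the historical set and one anchored at frame 9 into the novel set. If pose graph optimization shifts an anchor camera by one unit along x without rotating it, every Gaussian anchored to that frame shifts by the same amount.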

Paper Structure (23 sections, 14 equations, 5 figures, 7 tables)

Figures (5)

  • Figure 1: Example of loop closure optimization: A comparison of our method without loop optimization (left) and with loop optimization (right) on 3D Gaussian ellipsoid visualization (top) and novel view rendering (bottom).
  • Figure 2: Architecture of the proposed Gaussian Splatting SLAM. The input of our system is the current RGB-D frame from multiple rotating RGB-D cameras. In the camera tracking and Gaussian parameter update process, we use the differentiable rasterization results of three cameras to design effective loss functions. If a loop is detected, pose graph optimization is triggered first, then the 3D Gaussian positions are adjusted based on the updated camera poses, and finally a local bundle adjustment module further refines the camera poses, yielding accurate camera poses and a 3D Gaussian map.
  • Figure 3: Comparison of novel view rendering in virtual sequences. This is also supported by the quantitative results in the tables labeled "virtual w/o jitters" and "virtual".
  • Figure 4: Comparison of depth error maps on virtual and real-world datasets. Depth error maps obtained by calculating the differences between the rendered images and the ground truth are attached for better comparison. In these maps, shades of blue or cooler tones indicate smaller differences, while reds or warmer tones signify larger discrepancies.
  • Figure 5: Comparison of novel view rendering on the real-world dataset. This is also supported by the quantitative results in the table labeled "real data".
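
The depth error maps described for Figure 4 are per-pixel differences between rendered and ground-truth depth. A minimal Python sketch of that computation follows. It is an illustration under stated assumptions rather than the authors' evaluation code: depth images are plain nested lists, the error is the absolute difference, and pixels whose ground-truth depth is not positive are treated as invalid and skipped. The function names `depth_error_map` and `mean_error` are hypothetical.

```python
def depth_error_map(rendered, ground_truth):
    """Per-pixel absolute depth error.

    Pixels with ground-truth depth <= 0 are assumed invalid (an assumption of
    this sketch) and marked as None instead of receiving an error value.
    """
    return [
        [abs(r - g) if g > 0 else None for r, g in zip(r_row, g_row)]
        for r_row, g_row in zip(rendered, ground_truth)
    ]

def mean_error(error_map):
    """Average error over valid pixels; 0.0 if no pixel is valid."""
    valid = [e for row in error_map for e in row if e is not None]
    return sum(valid) / len(valid) if valid else 0.0

rendered = [[1.0, 2.0], [3.0, 4.0]]
ground_truth = [[1.0, 2.5], [0.0, 3.0]]   # 0.0 marks a missing depth reading
errors = depth_error_map(rendered, ground_truth)
```

A visualization like the one in the figure would then map small values in `errors` to cooler colors and large values to warmer ones; the choice of color map is not specified in the caption.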