RGB-Only Gaussian Splatting SLAM for Unbounded Outdoor Scenes

Sicheng Yu, Chong Cheng, Yifan Zhou, Xiaojun Yang, Hao Wang

TL;DR

This work tackles RGB-only SLAM for unbounded outdoor scenes by introducing OpenGS-SLAM, a pipeline that combines pairwise pointmap regression for robust pose estimation with a differentiable 3D Gaussian Splatting (3DGS) representation. The system jointly optimizes camera poses and Gaussian map parameters in an end-to-end differentiable loop, aided by an adaptive scale mapper and a dynamic learning rate strategy to handle complex outdoor motion. Key contributions include the first RGB-only 3DGS SLAM for outdoor environments, an end-to-end differentiable pipeline from pose estimation to 3DGS rendering, and state-of-the-art novel view synthesis on the Waymo dataset, with tracking error reduced to 9.8% of that of previous 3DGS methods. The approach promises practical impact for outdoor autonomous robotics by enabling high-fidelity rendering and reliable localization from RGB data alone, while maintaining computational efficiency suitable for real-time scenarios.

Abstract

3D Gaussian Splatting (3DGS) has become a popular solution in SLAM, as it can produce high-fidelity novel views. However, previous GS-based methods primarily target indoor scenes and rely on RGB-D sensors or pre-trained depth estimation models, and hence underperform in outdoor scenarios. To address this issue, we propose OpenGS-SLAM, an RGB-only Gaussian splatting SLAM method for unbounded outdoor scenes. Technically, we first employ a pointmap regression network to generate consistent pointmaps between frames for pose estimation. Compared to commonly used depth maps, pointmaps include spatial relationships and scene geometry across multiple views, enabling robust camera pose estimation. Then, we propose integrating the estimated camera poses with 3DGS rendering as an end-to-end differentiable pipeline. Our method achieves simultaneous optimization of camera poses and 3DGS scene parameters, significantly enhancing system tracking accuracy. We also design an adaptive scale mapper for the pointmap regression network, which provides more accurate pointmap mapping to the 3DGS map representation. Our experiments on the Waymo dataset demonstrate that OpenGS-SLAM reduces tracking error to 9.8% of previous 3DGS methods and achieves state-of-the-art results in novel view synthesis. Project Page: https://3dagentworld.github.io/opengs-slam/
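
The abstract does not spell out how the adaptive scale mapper works. The sketch below is not the authors' method. It only illustrates the underlying problem: pointmaps regressed from different image pairs need not share a metric scale, so new points must be rescaled before being added to an existing map. It assumes matched 3D points are already available, and the function names `estimate_relative_scale` and `rescale` are hypothetical.

```python
from math import dist
from statistics import median


def estimate_relative_scale(ref_points, new_points):
    """Return s such that s * new_points has the same scale as ref_points.

    Both arguments are lists of (x, y, z) tuples; index i in one list is
    assumed to be the same physical point as index i in the other.
    Distances between consecutive points are compared, so the estimate is
    unaffected by any rotation or translation between the two pointmaps.
    """
    if len(ref_points) != len(new_points) or len(ref_points) < 2:
        raise ValueError("need two equally long lists with at least 2 points")
    ratios = []
    for i in range(len(ref_points) - 1):
        d_ref = dist(ref_points[i], ref_points[i + 1])
        d_new = dist(new_points[i], new_points[i + 1])
        if d_new > 1e-9:  # skip (near-)coincident points
            ratios.append(d_ref / d_new)
    if not ratios:
        raise ValueError("all point pairs are degenerate")
    # The median is less sensitive to a few bad matches than the mean.
    return median(ratios)


def rescale(points, s):
    """Multiply every coordinate of every point by the scale factor s."""
    return [(s * x, s * y, s * z) for x, y, z in points]


if __name__ == "__main__":
    ref = [(0, 0, 0), (1, 0, 0), (1, 2, 0), (1, 2, 3)]
    # Same structure at half the scale, shifted by (5, 5, 5).
    new = [(5, 5, 5), (5.5, 5, 5), (5.5, 6, 5), (5.5, 6, 6.5)]
    print(estimate_relative_scale(ref, new))
```

A median over distance ratios is chosen here only for simplicity and robustness to outliers; a real system would need to handle matching, outlier rejection, and drift, none of which is covered by this sketch.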

Paper Structure

This paper contains 28 sections, 14 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: SLAM System Pipeline: Each frame provides an RGB image as input for tracking. The current and previous frames are fed as a pair into the Pointmap Regression network for pose estimation, followed by pose optimization based on the current Gaussian map. At keyframes, mapping is performed and the pointmap is processed by the Adaptive Scale Mapper for new Gaussian mapping. Camera poses and the Gaussian map are jointly optimized in the local window.
  • Figure 2: Novel View Rendering Results on 4 Waymo segments. For unbounded outdoor scenes, our method renders high-fidelity images, accurately capturing details of vehicles, streets, and buildings. In contrast, MonoGS and GlORIE-SLAM exhibit rendering distortions and blurriness.
  • Figure 3: Comparison of tracking trajectories with MonoGS on 4 segments. Our method greatly enhances tracking accuracy, with no noticeable drift.
  • Figure 4: Ablation study of learning rate adjustment and pointmap regression: tracking trajectories on two segments. Without them, tracking fails partway through the sequence.
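
The control flow described in the Figure 1 caption can be sketched roughly as follows. This is a structural outline under stated assumptions, not the authors' implementation: every callable passed in (`estimate_pose`, `refine_pose`, `insert_keyframe`, `optimize_window`) is a hypothetical placeholder, poses are treated as opaque objects, and the fixed-interval keyframe rule (`keyframe_every`) and the window size are assumptions made only for illustration.

```python
from collections import deque


def run_slam(frames, initial_pose, estimate_pose, refine_pose,
             insert_keyframe, optimize_window,
             keyframe_every=5, window_size=8):
    """Run a tracking/mapping loop and return one tracked pose per frame.

    estimate_pose(prev_frame, prev_pose, frame) -> initial pose guess from
        the frame pair (stands in for pairwise pointmap regression).
    refine_pose(frame, guess) -> pose refined against the current map.
    insert_keyframe(frame, pose) -> adds new map content for a keyframe
        (stands in for scale mapping plus new Gaussian insertion).
    optimize_window(keyframes) -> joint optimization over the local window,
        given as a list of (frame, pose) tuples.
    """
    window = deque(maxlen=window_size)  # most recent keyframes only
    trajectory = []
    prev_frame, prev_pose = None, None

    for index, frame in enumerate(frames):
        if prev_frame is None:
            pose = initial_pose  # the first frame anchors the trajectory
        else:
            guess = estimate_pose(prev_frame, prev_pose, frame)
            pose = refine_pose(frame, guess)

        # Assumed keyframe rule: every `keyframe_every`-th frame.
        if index % keyframe_every == 0:
            insert_keyframe(frame, pose)
            window.append((frame, pose))
            optimize_window(list(window))

        trajectory.append(pose)
        prev_frame, prev_pose = frame, pose

    return trajectory
```

In this simplified outline the returned trajectory holds the tracked poses only; any pose updates made by the window optimizer would have to be kept in the optimizer's own state.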