RGBD GS-ICP SLAM

Seongbo Ha, Jiung Yeon, Hyeonwoo Yu

TL;DR

This paper proposes a novel dense representation SLAM approach with a fusion of Generalized Iterative Closest Point (G-ICP) and 3D Gaussian Splatting (3DGS), utilizing a single Gaussian map for both tracking and mapping, resulting in mutual benefits.

Abstract

Simultaneous Localization and Mapping (SLAM) with dense representation plays a key role in robotics, Virtual Reality (VR), and Augmented Reality (AR) applications. Recent advancements in dense representation SLAM have highlighted the potential of leveraging neural scene representation and 3D Gaussian representation for high-fidelity spatial representation. In this paper, we propose a novel dense representation SLAM approach with a fusion of Generalized Iterative Closest Point (G-ICP) and 3D Gaussian Splatting (3DGS). In contrast to existing methods, we utilize a single Gaussian map for both tracking and mapping, resulting in mutual benefits. Through the exchange of covariances between tracking and mapping processes with scale alignment techniques, we minimize redundant computations and achieve an efficient system. Additionally, we enhance tracking accuracy and mapping quality through our keyframe selection methods. Experimental results demonstrate the effectiveness of our approach, which runs at up to 107 FPS (for the entire system) and produces a reconstructed map of superior quality.
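
For orientation, the registration objective that G-ICP minimizes can be written as below. This is the standard G-ICP formulation rather than an equation copied from this paper, so the notation is an assumption: a_i is a source point with covariance C_i^A (here, a Gaussian built from the current frame), b_i is its matched target point with covariance C_i^B (here, a Gaussian from the map), and T is the camera pose with rotation part R.

```latex
% Standard G-ICP objective (notation assumed, may differ from the paper's).
\begin{align}
  d_i &= b_i - T a_i, \\
  T^{*} &= \arg\min_{T} \sum_i d_i^{\top}
           \left( C_i^{B} + R\, C_i^{A} R^{\top} \right)^{-1} d_i .
\end{align}
```

Because both terms in the weighting matrix are Gaussian covariances, the same covariances can in principle serve the 3DGS map, which is the sharing between tracking and mapping that the abstract describes.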

Paper Structure (16 sections, 6 equations, 5 figures, 8 tables)

This paper contains 16 sections, 6 equations, 5 figures, 8 tables.

Figures (5)

  • Figure 1: A comparison of PSNR against whole-system FPS for recent SLAM algorithms that use dense representations, such as neural scene representations and 3D Gaussian representations. Our method achieves state-of-the-art rendering quality and whole-system FPS. Note that FPS here measures the overall system, not tracking or mapping alone. Reported values are averages over the eight Replica scenes.
  • Figure 2: System Overview. The input to our system is an RGB-D frame. We generate a point cloud by downsampling and reprojecting the current depth image, and use it in the G-ICP process. During G-ICP, we create source Gaussians from the point cloud and estimate the current camera pose by aligning them with target Gaussians, which are a subset of the 3DGS map. If the current frame is identified as a keyframe or a mapping-only keyframe, we add the source Gaussians to the 3DGS map as new primitives. Meanwhile, the mapping process runs concurrently with tracking and optimizes the Gaussians together with their colors and opacities.
  • Figure 3: Tracking Accuracy Comparison Based on Keyframe Selection Methods. The reported values are averages across the eight scenes of the Replica dataset [straub2019replica]. Selecting a keyframe every n frames (shown in blue) gives notably low tracking accuracy. In contrast, our keyframe selection method yields the highest tracking accuracy.
  • Figure 4: Separated Keyframe Selection on Replica office4. A small number of tracking keyframes yields accurate trajectory estimation (case 1), while a large number of mapping keyframes yields high rendering performance (case 2). Our method therefore adopts case 3, which selects tracking keyframes and mapping keyframes separately, at different intervals.
  • Figure 5: Comparison of Rendering Results. In the first scene, SplaTAM [splatam] failed to reconstruct the pillow and lamp, and Point-SLAM [pointslam] failed to represent the detailed pattern of the pillow. In the second scene, both SplaTAM and Point-SLAM failed to accurately reconstruct the details of the clock. In contrast, our method produces renderings that closely resemble the ground-truth images.
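
The front end described in the Figure 2 caption (downsample the depth image, reproject it to a point cloud, then build source Gaussians) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `stride` downsampling parameter, the pinhole intrinsics `fx, fy, cx, cy`, and the brute-force k-nearest-neighbour covariance estimate (a common G-ICP choice) are all hypothetical.

```python
import numpy as np


def depth_to_points(depth, fx, fy, cx, cy, stride=4):
    """Back-project a depth image into camera-frame 3D points.

    Assumes a pinhole camera; `stride` (hypothetical) keeps every
    stride-th pixel in both directions as a simple downsampling step.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h:stride, 0:w:stride]  # pixel rows (v) and columns (u)
    z = depth[v, u]
    valid = z > 0                            # drop pixels with no depth reading
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)       # shape (N, 3)


def neighbor_covariances(points, k=10):
    """Estimate one 3x3 covariance per point from its k nearest neighbours.

    Brute-force O(N^2) search, for illustration only; each point counts
    as its own nearest neighbour.
    """
    k = min(k, len(points))
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.einsum("ijk,ijk->ij", diff, diff)
    idx = np.argsort(sq_dist, axis=1)[:, :k]        # (N, k) neighbour indices
    neigh = points[idx]                             # (N, k, 3)
    centered = neigh - neigh.mean(axis=1, keepdims=True)
    return np.einsum("nki,nkj->nij", centered, centered) / max(k - 1, 1)


if __name__ == "__main__":
    depth = np.full((8, 8), 2.0)                    # synthetic flat wall at 2 m
    pts = depth_to_points(depth, fx=4.0, fy=4.0, cx=4.0, cy=4.0, stride=4)
    covs = neighbor_covariances(pts, k=3)
    print(pts.shape, covs.shape)
```

Each point and its covariance together define one source Gaussian. A real system would replace the brute-force search with a spatial index and would run on the GPU.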