LVI-GS: Tightly-coupled LiDAR-Visual-Inertial SLAM using 3D Gaussian Splatting

Huibin Zhao, Weipeng Guan, Peng Lu

TL;DR

This article introduces LVI-GS, a tightly coupled LiDAR-visual-inertial SLAM system built on 3D Gaussian splatting, which leverages the complementary characteristics of light detection and ranging (LiDAR) and image sensors to capture both the geometric structure and the visual detail of 3D scenes.

Abstract

3D Gaussian Splatting (3DGS) has demonstrated rapid rendering and high-fidelity mapping. In this paper, we introduce LVI-GS, a tightly-coupled LiDAR-Visual-Inertial mapping framework with 3DGS, which leverages the complementary characteristics of LiDAR and image sensors to capture both geometric structures and visual details of 3D scenes. To this end, the 3D Gaussians are initialized from colourized LiDAR points and optimized using differentiable rendering. To achieve high-fidelity mapping, we introduce a pyramid-based training approach that learns multi-level features, and we incorporate a depth loss derived from LiDAR measurements to improve geometric feature perception. Through dedicated strategies for Gaussian-map expansion, keyframe selection, thread management, and custom CUDA acceleration, our framework achieves real-time photo-realistic mapping. Numerical experiments demonstrate that our method outperforms state-of-the-art 3D reconstruction systems.
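
The abstract names two training ingredients without giving their form: a pyramid of image resolutions and a depth loss from LiDAR measurements. The following is a minimal NumPy sketch of how such pieces are commonly put together, not the authors' implementation. The function names, the 2x2 average-pooling pyramid, the coarse-to-fine ordering, the plain L1 photometric term, and the `depth_weight` value are all assumptions made for illustration.

```python
import numpy as np


def downsample2x(img):
    """Halve resolution by averaging non-overlapping 2x2 blocks (assumed scheme)."""
    h, w = img.shape[:2]
    h2, w2 = h - h % 2, w - w % 2  # crop to an even size
    img = img[:h2, :w2]
    blocks = img.reshape(h2 // 2, 2, w2 // 2, 2, *img.shape[2:])
    return blocks.mean(axis=(1, 3))


def build_pyramid(img, levels):
    """Return `levels` images ordered coarse to fine (ordering is an assumption)."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample2x(pyramid[-1]))
    return pyramid[::-1]


def masked_l1_depth_loss(rendered_depth, lidar_depth):
    """Mean absolute depth error over pixels that have a LiDAR return (> 0)."""
    valid = lidar_depth > 0
    if not valid.any():
        return 0.0
    return float(np.abs(rendered_depth[valid] - lidar_depth[valid]).mean())


def total_loss(rendered_rgb, target_rgb, rendered_depth, lidar_depth,
               depth_weight=0.5):
    """Photometric L1 plus weighted depth term; the weight is a placeholder."""
    photometric = float(np.abs(rendered_rgb - target_rgb).mean())
    depth = masked_l1_depth_loss(rendered_depth, lidar_depth)
    return photometric + depth_weight * depth
```

In a training loop, one would render at each pyramid level in turn and compare against the matching downsampled target, so that coarse structure is fitted before fine detail. The LiDAR depth map is sparse, which is why the depth term is averaged only over pixels with a valid return.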

Paper Structure

This paper contains 20 sections, 12 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Overview of our proposed LVI-GS system.
  • Figure 2: Qualitative performance comparison of diverse 3DGS SLAM systems.
  • Figure 3: Rendering examples and detailed views of four frames from the hkust_campus_00 (m2) sequence of the R3LIVE dataset. The red line represents the running trajectory, while (a)-(d) show four selected rendered images at various positions, highlighting details such as glass surfaces, tree branches, and steps. The area within the red rectangle is enlarged to facilitate comparison with the ground truth.
  • Figure 4: Training metrics (PSNR and SSIM) over iterations for a keyframe.
  • Figure 5: Rendering images at different training iterations.
  • ...and 1 more figure