GS-LIVO: Real-Time LiDAR, Inertial, and Visual Multi-sensor Fused Odometry with Gaussian Mapping

Sheng Hong, Chunran Zheng, Yishu Shen, Changze Li, Fu Zhang, Tong Qin, Shaojie Shen

TL;DR

GS-LIVO tackles real-time dense SLAM by replacing traditional colored point clouds with a photorealistic Gaussian-splat map and tightly fusing LiDAR, IMU, and visual data. The core innovations are a global hash-indexed octree Gaussian map with a contiguous FoV-based sliding window for real-time updates, and an IESKF-based tightly coupled estimator that propagates photometric and inertial information. Key contributions include (1) a scalable global Gaussian map, (2) LiDAR-visual joint initialization, (3) incremental sliding-window map optimization, and (4) a tightly coupled multisensor fusion framework capable of running on resource-constrained embedded hardware. Experiments demonstrate significant memory and computation savings, real-time performance (over $10$ Hz indoors and around $3$ Hz outdoors), and competitive localization accuracy with high-fidelity rendering across indoor and outdoor scenes, including deployment on a Jetson Orin NX. The work advances practical Gaussian-based SLAM by delivering online map updates and robust odometry suitable for mobile robots and embedded platforms.

Abstract

In recent years, 3D Gaussian splatting (3D-GS) has emerged as a novel scene representation approach. However, existing vision-only 3D-GS methods often rely on hand-crafted heuristics for point-cloud densification and face challenges in handling occlusions and high GPU memory and computation consumption. LiDAR-Inertial-Visual (LIV) sensor configuration has demonstrated superior performance in localization and dense mapping by leveraging complementary sensing characteristics: rich texture information from cameras, precise geometric measurements from LiDAR, and high-frequency motion data from IMU. Inspired by this, we propose a novel real-time Gaussian-based simultaneous localization and mapping (SLAM) system. Our map system comprises a global Gaussian map and a sliding window of Gaussians, along with an IESKF-based odometry. The global Gaussian map consists of hash-indexed voxels organized in a recursive octree, effectively covering sparse spatial volumes while adapting to different levels of detail and scales. The Gaussian map is initialized through multi-sensor fusion and optimized with photometric gradients. Our system incrementally maintains a sliding window of Gaussians, significantly reducing GPU computation and memory consumption by only optimizing the map within the sliding window. Moreover, we implement a tightly coupled multi-sensor fusion odometry with an iterative error state Kalman filter (IESKF), leveraging real-time updating and rendering of the Gaussian map. Our system represents the first real-time Gaussian-based SLAM framework deployable on resource-constrained embedded systems, demonstrated on the NVIDIA Jetson Orin NX platform. The framework achieves real-time performance while maintaining robust multi-sensor fusion capabilities. All implementation algorithms, hardware designs, and CAD models will be publicly available.
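
The abstract describes the global map as hash-indexed voxels, each organized as a recursive octree. The following is a minimal illustrative sketch of that kind of structure, not the authors' implementation: the class names (`VoxelHashMap`, `OctreeNode`), the parameters (`voxel_size`, `max_depth`, `capacity`), and the point-count subdivision rule are all assumptions made for illustration, and the leaves store raw 3D points rather than Gaussian parameters.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple

Point = Tuple[float, float, float]
VoxelKey = Tuple[int, int, int]


@dataclass
class OctreeNode:
    """One octree cell (hypothetical layout); leaves hold the points."""
    center: Point
    half_size: float
    depth: int
    points: List[Point] = field(default_factory=list)
    children: Optional[List[Optional["OctreeNode"]]] = None

    def insert(self, p: Point, max_depth: int, capacity: int) -> None:
        if self.children is None:
            self.points.append(p)
            # Assumed rule: subdivide a leaf once it holds too many points.
            if len(self.points) > capacity and self.depth < max_depth:
                pts, self.points = self.points, []
                self.children = [None] * 8
                for q in pts:
                    self._child_for(q).insert(q, max_depth, capacity)
            return
        self._child_for(p).insert(p, max_depth, capacity)

    def _child_for(self, p: Point) -> "OctreeNode":
        # One bit per axis selects the octant containing p.
        i = sum(int(p[k] >= self.center[k]) << k for k in range(3))
        if self.children[i] is None:
            h = self.half_size / 2
            c = tuple(self.center[k] + (h if (i >> k) & 1 else -h)
                      for k in range(3))
            self.children[i] = OctreeNode(c, h, self.depth + 1)
        return self.children[i]

    def leaves(self) -> Iterator["OctreeNode"]:
        if self.children is None:
            yield self
        else:
            for child in self.children:
                if child is not None:
                    yield from child.leaves()


class VoxelHashMap:
    """Sparse map: a dict from integer voxel index to an octree root."""

    def __init__(self, voxel_size: float = 1.0, max_depth: int = 3,
                 capacity: int = 4) -> None:
        self.voxel_size = voxel_size
        self.max_depth = max_depth
        self.capacity = capacity
        self.roots: Dict[VoxelKey, OctreeNode] = {}

    def key(self, p: Point) -> VoxelKey:
        # floor (not int truncation) keeps negative coordinates consistent.
        return tuple(math.floor(c / self.voxel_size) for c in p)

    def insert(self, p: Point) -> None:
        k = self.key(p)
        root = self.roots.get(k)
        if root is None:
            # Root voxels are allocated lazily, only where data exists.
            center = tuple((i + 0.5) * self.voxel_size for i in k)
            root = OctreeNode(center, self.voxel_size / 2, 0)
            self.roots[k] = root
        root.insert(p, self.max_depth, self.capacity)
```

Because root voxels are created only where points land, empty space costs nothing, while the per-voxel octree refines only densely observed regions. This is one plausible reading of "covering sparse spatial volumes while adapting to different levels of detail"; the subdivision criterion actually used in the paper is not stated in this section.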

Paper Structure

This paper contains 32 sections, 35 equations, 13 figures, 4 tables.

Figures (13)

  • Figure 1: Components of GS-LIVO for large-scale scenarios: real-time odometry and Gaussian mapping on aerial datasets [li2024mars].
  • Figure 2: System overview of GS-LIVO: a real-time LiDAR-Inertial-Visual odometry system with Gaussian Splatting-based mapping. The pipeline performs joint initialization and optimization of Gaussians using multi-sensor data, managed through a hash-indexed octree structure and sliding window mechanism.
  • Figure 3: An overview of the procedures for incrementally updating the sliding window of Gaussians (detailed in Sec. \ref{sec.map}).
  • Figure 4: Comparison of map representation detail with a patch-based method. (a) and (b) show the warping transformation results using patch sizes of 32 and 64, respectively. (c) shows the Gaussian rendering result, and (d) shows the reference (ground-truth) image.
  • Figure 5: Mapping results of three distinct real-world scenes (a)-(c). Top row: rendering results from the camera poses. Middle row: rendering results from roaming perspectives. Bottom row: the shapes of the scene Gaussians.
  • ...and 8 more figures