LIV-GaussMap: LiDAR-Inertial-Visual Fusion for Real-time 3D Radiance Field Map Rendering

Sheng Hong, Junjie He, Xinhu Zheng, Chunran Zheng, Shaojie Shen

TL;DR

This work tackles real-time, high-fidelity 3D mapping for large-scale scenes under non-Lambertian conditions by fusing LiDAR, IMU, and camera data. It introduces LIV-GaussMap, representing the scene as differentiable 3D Gaussian splats with ellipsoidal geometry and spherical harmonic radiance coefficients $k_\ell^m$, initialized via size-adaptive LiDAR voxels and refined with photometric gradients. Structure adaptive control densifies under-reconstructed regions and prunes overdense areas to produce hole-free, photorealistic novel-view renderings in real time. Evaluations on diverse LiDAR modalities and datasets show competitive or superior PSNR/SSIM/LPIPS and improved structural metrics, and the authors release open-source software, hardware configurations, and datasets to support community research.
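For reference, view-dependent radiance encoded with spherical harmonics is commonly written as the expansion below. This is the standard form used in Gaussian-splatting renderers, offered here as an assumption about what the coefficients $k_\ell^m$ denote. The symbols $C$, $\mathbf{d}$, $Y_\ell^m$, and the truncation degree $\ell_{\max}$ do not appear in the summary above, and the paper's own notation may differ.

$$
C(\mathbf{d}) = \sum_{\ell=0}^{\ell_{\max}} \sum_{m=-\ell}^{\ell} k_\ell^m \, Y_\ell^m(\mathbf{d})
$$

Here $\mathbf{d}$ is the unit viewing direction and $Y_\ell^m$ are the spherical harmonic basis functions. An expansion truncated at degree $\ell_{\max}$ carries $(\ell_{\max}+1)^2$ coefficients per color channel, which is what allows a single Gaussian to change color with viewpoint and so model non-Lambertian surfaces.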

Abstract

We introduce an integrated, precise LiDAR, Inertial, and Visual (LIV) multimodal sensor-fused mapping system that builds on differentiable Gaussians to improve mapping fidelity, quality, and structural accuracy. Notably, this is also a novel form of tightly coupled map for LiDAR-visual-inertial sensor fusion. The system leverages the complementary characteristics of LiDAR and visual data to capture the geometric structures of large-scale 3D scenes and restore their visual surface information with high fidelity. The initial surface Gaussians of the scene and the sensor pose of each frame are obtained from a LiDAR-inertial system with size-adaptive voxels. We then refine the Gaussians using visually derived photometric gradients to optimize their quality and density. Our method is compatible with various types of LiDAR, including solid-state and mechanical LiDAR, and supports both repetitive and non-repetitive scanning modes. It bolsters structure construction through LiDAR and facilitates real-time generation of photorealistic renderings across diverse LIV datasets. It shows notable resilience and versatility in generating real-time photorealistic scenes, potentially for digital twins and virtual reality, and may also be applicable to real-time SLAM and robotics. We release our software, hardware, and self-collected datasets to benefit the community.
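The LiDAR-based initialization described in the abstract can be pictured with a minimal sketch: the points that fall into one voxel are summarized by a mean and a covariance, which together define an ellipsoidal Gaussian. This is only an illustration under stated assumptions, not the authors' implementation. The function name `fit_voxel_gaussian` and the sample points are hypothetical, and the size-adaptive voxel subdivision and the later photometric refinement are not shown.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = List[List[float]]


def fit_voxel_gaussian(points: Sequence[Vec3]) -> Tuple[Vec3, Mat3]:
    """Return the sample mean and covariance of the points in one voxel.

    Hypothetical helper for illustration; the paper's pipeline may differ.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to estimate a covariance")

    # Gaussian center: average of the LiDAR returns in the voxel.
    mean = tuple(sum(p[i] for p in points) / n for i in range(3))

    # Gaussian shape: unbiased sample covariance of the returns.
    cov = [[0.0] * 3 for _ in range(3)]
    for p in points:
        d = [p[i] - mean[i] for i in range(3)]
        for r in range(3):
            for c in range(3):
                cov[r][c] += d[r] * d[c] / (n - 1)
    return mean, cov


# Hypothetical voxel: four returns from a flat patch lying in the z = 1 plane.
patch = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (1.0, 1.0, 1.0)]
mean, cov = fit_voxel_gaussian(patch)
```

For this flat patch the variance along z is zero, so the resulting Gaussian degenerates to a thin disc aligned with the surface, which is the kind of surface-hugging ellipsoid a planar LiDAR patch suggests.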

Paper Structure (13 sections, 10 equations, 6 figures, 4 tables)

This paper contains 13 sections, 10 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: The real-world experiments were performed on both public and private datasets, covering small-scale indoor environments and large-scale outdoor settings. The image shows our radiance field maps of HKU LSK (a), HKU Main Building (b), HKUST GZ Tower C2 outdoor (c) and indoor (e), HKUST GZ Makerspace (d), and HKUST GZ Red Bird (f).
  • Figure 2: An aerial view of the indoor and outdoor scenes of HKUST GZ Tower C1. In contrast to the structure built up from vision alone by 3D-GS (Kerbl et al., 2023), our approach yields a more refined structure with few artifacts.
  • Figure 3: The construction process of the map. (1) Initially, the Gaussians of the scene are derived from a Kalman-filtered LiDAR-inertial system. The surfaces of the scene are estimated using LiDAR measurements and are further developed into ellipsoidal surface Gaussians. (2) We further optimize the Gaussians using photometric gradients. This optimized map allows us to synthesize new views with precise photometry and generate a hole-free map.
  • Figure 4: The overview of our proposed system: The left side illustrates the sensory inputs and the configuration of the data acquisition equipment. The right side details the software pipeline, showcasing the sequence of processing steps.
  • Figure 5: This box plot illustrates the comparative performance of the leading method and our approach in terms of interpolation and extrapolation across datasets by PSNR values.
  • ...and 1 more figures