SiLVR: Scalable Lidar-Visual Radiance Field Reconstruction with Uncertainty Quantification

Yifu Tao; Maurice Fallon

TL;DR

SiLVR is a scalable lidar-visual NeRF framework for large-scale 3D reconstruction that integrates depth and surface-normal cues from lidar with multi-view imagery. By embedding a perturbation field and applying the Laplace approximation, it yields an explicit epistemic uncertainty map ($\boldsymbol{H}^{-1}$) that quantifies each sensor's contribution and is used to filter artefacts, especially at submap boundaries. The system uses depth-KL and normal regularisation, sky segmentation, and visibility-based submapping, complemented by COLMAP-based pose refinement, to deliver geometrically accurate maps with photorealistic textures across more than $20{,}000~\text{m}^2$ of real-world data. This uncertainty-aware, large-scale fusion enables more reliable navigation, view planning, and mapping in robotics applications where textureless or occluded regions pose challenges.
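
As a rough illustration of how a Laplace approximation turns curvature into an uncertainty estimate, the sketch below assumes a diagonal Gauss-Newton approximation of $\boldsymbol{H}$ (a sum of squared per-observation gradients plus a small prior term) and inverts it element-wise. The function name `laplace_diag_variance`, the toy gradients, and the prior value are hypothetical and are not taken from SiLVR.

```python
def laplace_diag_variance(gradient_rows, prior_precision=1e-4):
    """Approximate per-parameter variance as the inverse of a diagonal Hessian.

    gradient_rows: one gradient vector per observation (hypothetical toy input).
    prior_precision: small positive term that keeps the diagonal invertible.
    """
    n_params = len(gradient_rows[0])
    hessian_diag = [prior_precision] * n_params
    for row in gradient_rows:
        for i, g in enumerate(row):
            hessian_diag[i] += g * g  # Gauss-Newton style accumulation
    return [1.0 / h for h in hessian_diag]


# Parameter 0 is strongly constrained by the observations, parameter 1 barely at all.
toy_gradients = [[1.0, 0.0], [2.0, 0.0], [1.0, 0.1]]
variance = laplace_diag_variance(toy_gradients)
print(variance[0] < variance[1])  # the weakly observed parameter is more uncertain
```

Under these assumptions, parameters that few observations constrain receive a large variance, which matches the intuition that sparsely observed regions should be flagged as uncertain.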

Abstract

We present a neural radiance field (NeRF) based large-scale reconstruction system that fuses lidar and vision data to generate high-quality reconstructions that are geometrically accurate and capture photorealistic texture. Our system adopts the state-of-the-art NeRF representation and extends it to incorporate lidar. Lidar data adds strong geometric constraints on depth and surface normals, which is particularly useful when modelling surfaces with uniform texture, where visual reconstruction cues are ambiguous. A key contribution of this work is a novel method to quantify the epistemic uncertainty of the lidar-visual NeRF reconstruction by estimating the spatial variance of each point location in the radiance field given the sensor observations from the cameras and lidar. This provides a principled approach to evaluate the contribution of each sensor modality to the final reconstruction. In this way, reconstructions that are uncertain (due to, e.g., uniform visual texture, limited observation viewpoints, or little lidar coverage) can be identified and removed. Our system is integrated with a real-time lidar SLAM system, which is used to bootstrap a Structure-from-Motion (SfM) reconstruction procedure. The SLAM system also helps to constrain the overall metric scale, which is essential for the lidar depth loss. The refined SLAM trajectory can then be divided into submaps using Spectral Clustering to group sets of co-visible images together. This submapping approach is more suitable for visual reconstruction than distance-based partitioning. Our uncertainty estimation is particularly effective when merging submaps, as their boundaries often contain artefacts due to limited observations. We demonstrate the reconstruction system using a multi-camera, lidar sensor suite in experiments involving both robot-mounted and handheld scanning. Our test datasets cover a total area of more than 20,000 square metres.
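
To make the idea of grouping co-visible images concrete, here is a minimal sketch of two-way spectral bisection, a simplified instance of spectral clustering rather than the paper's actual pipeline. It assumes a symmetric co-visibility matrix with at least two connected images; the function name `spectral_bisection` and the toy scores are hypothetical.

```python
def spectral_bisection(affinity, iterations=200):
    """Split nodes into two groups using the sign of the Fiedler vector.

    affinity: symmetric list-of-lists of non-negative co-visibility scores.
    The Fiedler vector is found by power iteration on (shift * I - L),
    where L = D - W is the unnormalised graph Laplacian.
    """
    n = len(affinity)
    degree = [sum(row) for row in affinity]
    shift = 2.0 * max(degree)  # upper bound on the largest eigenvalue of L

    v = [float(i) for i in range(n)]  # simple deterministic starting vector
    for _ in range(iterations):
        mean = sum(v) / n
        v = [x - mean for x in v]  # remove the constant (trivial) eigenvector
        v = [
            (shift - degree[i]) * v[i]
            + sum(affinity[i][j] * v[j] for j in range(n))
            for i in range(n)
        ]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return [0 if x < 0 else 1 for x in v]


# Toy co-visibility: images 0-2 see each other, images 3-5 see each other,
# and only images 0 and 3 share a weak overlap.
covis = [
    [0, 1, 1, 0.05, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0.05, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
labels = spectral_bisection(covis)
print(labels)
```

The partition follows shared visibility rather than spatial distance, so two images that are close together but face different surfaces can land in different submaps. A practical system would more likely use a library implementation with a normalised Laplacian and more than two clusters.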
