Evaluating geometric accuracy of NeRF reconstructions compared to SLAM method

Adam Korycki, Colleen Josephson, Steve McGuire

TL;DR

This work evaluates whether Neural Radiance Field (NeRF) reconstructions can deliver metric geometric accuracy comparable to LiDAR-inertial SLAM for real-world mapping, using a 40 cm PVC pipe as the target. Two NeRF pipelines are built from distinct data streams (iPhone imagery with ARKit poses, and robot-sourced imagery with LIOSAM poses) and are benchmarked against a state-of-the-art LIOSAM LiDAR-SLAM point cloud. Results show that the NeRFs produce cleaner geometry with less surface noise than the LiDAR-based reconstruction, that diameter estimates fall within a few centimeters of the ground truth, and that iPhone data yields better image-quality metrics. The findings suggest that NeRF-based mapping from consumer devices is a viable, accessible alternative for metric tasks and could augment SLAM pipelines to reduce drift in real-world applications such as forest mapping and construction surveys.

Abstract

As Neural Radiance Field (NeRF) implementations become faster, more efficient, and more accurate, they become increasingly applicable to real-world mapping tasks. Traditionally, 3D mapping, or scene reconstruction, has relied on expensive LiDAR sensing. Photogrammetry can perform image-based 3D reconstruction but is computationally expensive and requires extremely dense image coverage to recover complex geometry and photorealism. NeRFs perform 3D scene reconstruction by training a neural network on sparse image and pose data, achieving superior results to photogrammetry with less input data. This paper presents an evaluation of two NeRF scene reconstructions for the purpose of estimating the diameter of a vertical PVC cylinder. One is trained on commodity iPhone data and the other on robot-sourced imagery and poses. This neural geometry is compared to state-of-the-art LiDAR-inertial SLAM in terms of scene noise and metric accuracy.

Paper Structure

This paper contains 11 sections, 1 equation, 4 figures, and 1 table.

Figures (4)

  • Figure 1: The robot platform (left) used for LiDAR-inertial SLAM reconstruction (right).
  • Figure 2: The Nerfacto network architecture, composed of two MLPs with positional encoding to approximate high-frequency volume functions and appearance embeddings to account for varying exposure across training images (Tancik et al., nerfstudio).
  • Figure 3: NeRF-synthesized novel views of the PVC pipe. The top views are neural renderings based on iPhone image and pose training data; the bottom views are rendered from robot-sourced data.
  • Figure 4: PVC pipe point cloud reconstructions produced by LiDAR-inertial SLAM (LIOSAM), NeRF-LIOSAM fusion, and NeRFCapture methods. The corresponding 2D projections of the points used by TreeTool for cylinder-ellipse modeling are shown in red. The blue circles represent the fitted cylinder-ellipse models. These results provide two key insights. First, the RANSAC cylinder fitting method used in TreeTool is biased towards circular portions of the point cloud, which explains the under-sizing trend in the estimated diameters. Second, the NeRF reconstructions show significantly less noise than the SLAM reconstruction. The higher noise in the SLAM reconstruction is likely due to sub-optimal LiDAR-inertial extrinsic calibration.
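
The Figure 4 caption describes fitting a circular model to a 2D projection of the pipe points with RANSAC and reading the diameter off the fit. The sketch below illustrates that general idea with a plain RANSAC circle fit in NumPy. It is not TreeTool's implementation: the function names, the 1 cm inlier tolerance, the iteration count, and the least-squares refinement step are all assumptions chosen for illustration, and the synthetic 40 cm ring stands in for real point cloud data.

```python
import numpy as np


def circle_from_three_points(p1, p2, p3):
    """Circle through three 2D points; returns (center, radius) or None if collinear."""
    ax, ay = p1
    bx, by = p2
    cx, cy = p3
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return np.array([ux, uy]), float(np.hypot(ax - ux, ay - uy))


def ransac_circle(points_xy, n_iters=300, inlier_tol=0.01, seed=0):
    """Fit a circle to 2D points (meters) with RANSAC, then refine on the inliers.

    inlier_tol is the maximum radial distance to the circle (assumed 1 cm here).
    Returns (center, radius, boolean inlier mask).
    """
    rng = np.random.default_rng(seed)
    best_inliers, best_count = None, 0
    for _ in range(n_iters):
        idx = rng.choice(len(points_xy), size=3, replace=False)
        model = circle_from_three_points(*points_xy[idx])
        if model is None:
            continue
        center, radius = model
        residuals = np.abs(np.linalg.norm(points_xy - center, axis=1) - radius)
        inliers = residuals < inlier_tol
        count = int(inliers.sum())
        if count > best_count:
            best_inliers, best_count = inliers, count
    if best_inliers is None:
        raise ValueError("no non-degenerate circle found")

    # Algebraic least-squares refinement on the inliers:
    # x^2 + y^2 = 2*cx*x + 2*cy*y + (r^2 - cx^2 - cy^2)
    x, y = points_xy[best_inliers].T
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    b = x**2 + y**2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    radius = float(np.sqrt(c + cx**2 + cy**2))
    return np.array([cx, cy]), radius, best_inliers


def make_synthetic_ring(diameter, center, n_ring, n_outliers, noise, seed=0):
    """Hypothetical test data: noisy points on a circle plus uniform clutter."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, n_ring)
    r = diameter / 2.0 + rng.normal(0.0, noise, n_ring)
    ring = np.column_stack([center[0] + r * np.cos(theta),
                            center[1] + r * np.sin(theta)])
    outliers = np.asarray(center) + rng.uniform(-0.5, 0.5, size=(n_outliers, 2))
    return np.vstack([ring, outliers])


if __name__ == "__main__":
    pts = make_synthetic_ring(0.40, (1.0, -0.5), 200, 40, noise=0.002, seed=1)
    center, radius, inliers = ransac_circle(pts)
    print(f"estimated diameter: {2 * radius:.3f} m, inliers: {int(inliers.sum())}")
```

In this sketch the synthetic ring covers the full circumference. A scan that sees only part of the pipe, or a cross-section that is not truly circular, gives the fit less to work with, so the behaviour on real reconstructions may differ from this idealized case.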