Evaluating geometric accuracy of NeRF reconstructions compared to SLAM method
Adam Korycki, Colleen Josephson, Steve McGuire
TL;DR
This work evaluates whether Neural Radiance Field (NeRF) reconstructions can deliver metric geometric accuracy comparable to LiDAR-inertial SLAM for real-world mapping, using a 40 cm PVC pipe as the target. Two NeRF pipelines are built from distinct data streams (iPhone imagery with ARKit poses, and robot-sourced imagery with LIO-SAM poses) and are benchmarked against a point cloud from LIO-SAM, a state-of-the-art LiDAR-inertial SLAM method. Results show that the NeRFs produce cleaner geometry with less surface noise than the LiDAR-based reconstruction, that their diameter estimates fall within a few centimeters of the ground truth, and that the iPhone data yields better image-quality metrics than the robot-sourced data. The findings suggest that NeRF-based mapping from consumer devices is a viable, accessible alternative for metric tasks and could augment SLAM pipelines to reduce drift in real-world applications such as forest mapping and construction surveys.
Abstract
As Neural Radiance Field (NeRF) implementations become faster, more efficient, and more accurate, they become increasingly practical for real-world mapping tasks. Traditionally, 3D mapping, or scene reconstruction, has relied on expensive LiDAR sensing. Photogrammetry can perform image-based 3D reconstruction but is computationally expensive and requires extremely dense image representation to recover complex geometry and photorealism. NeRFs perform 3D scene reconstruction by training a neural network on sparse image and pose data, achieving superior results to photogrammetry with less input data. This paper presents an evaluation of two NeRF scene reconstructions for the purpose of estimating the diameter of a vertical PVC cylinder. One of these is trained on commodity iPhone data and the other is trained on robot-sourced imagery and poses. The resulting neural geometry is compared to state-of-the-art LiDAR-inertial SLAM in terms of scene noise and metric accuracy.
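To make the diameter-estimation task concrete, the following is a minimal sketch of one common way to estimate the diameter of a vertical cylinder from a point cloud: take the points in a height band, project them onto the horizontal plane, and fit a circle by algebraic least squares (the Kasa fit). This is an illustration only, not necessarily the procedure used in this paper. The function names `fit_circle_xy` and `estimate_diameter`, the assumption that the cylinder axis is aligned with z, and the height-band parameters are all hypothetical.

```python
import numpy as np

def fit_circle_xy(points_xy):
    """Algebraic (Kasa) least-squares circle fit; returns (cx, cy, radius)."""
    x, y = points_xy[:, 0], points_xy[:, 1]
    # A circle satisfies x^2 + y^2 = 2*cx*x + 2*cy*y + c,
    # with c = r^2 - cx^2 - cy^2, which is linear in (cx, cy, c).
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    b = x**2 + y**2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    return cx, cy, np.sqrt(c + cx**2 + cy**2)

def estimate_diameter(cloud, z_min, z_max):
    """Estimate the diameter of a z-aligned cylinder from an (N, 3) cloud.

    Assumption: the cylinder axis is parallel to z, so a horizontal
    projection of any height band is (approximately) a circle.
    """
    band = cloud[(cloud[:, 2] >= z_min) & (cloud[:, 2] <= z_max)]
    if len(band) < 3:
        raise ValueError("need at least 3 points in the height band")
    _, _, radius = fit_circle_xy(band[:, :2])
    return 2.0 * radius
```

Given an `(N, 3)` array exported from either a NeRF or a LiDAR-SLAM reconstruction, `estimate_diameter(cloud, z_min, z_max)` returns one diameter estimate per height band. The algebraic fit is simple but sensitive to outliers, so a noisy cloud would likely need filtering or a robust fit (for example RANSAC) first.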
