RePose-NeRF: Robust Radiance Fields for Mesh Reconstruction under Noisy Camera Poses
Sriram Srinivasan, Gautam Ramachandra
TL;DR
RePose-NeRF tackles robust 3D reconstruction from multi-view images with noisy poses by jointly refining camera extrinsics $\boldsymbol{\theta}_i \in \mathfrak{se}(3)$ and learning an implicit scene representation. It introduces a two-stage pipeline: Stage 1 learns a grid-based NeRF with pose refinement and an SDF-based geometry representation, while Stage 2 performs differentiable mesh refinement and texture baking to produce editable, view-consistent meshes compatible with standard graphics and robotics tools. The method combines a coarse-to-fine hashing strategy, Eikonal and entropy regularization, and occupancy-grid sampling to achieve fast convergence and robust reconstruction under pose uncertainty, outperforming BARF on the LLFF and Blender NeRF-Synthetic datasets. The resulting textured meshes can be deployed directly in perception, manipulation, and simulation pipelines, bridging neural implicit representations with practical robotic applications.
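To make the pose parameterization concrete, the following is a minimal sketch of how a correction $\boldsymbol{\theta}_i \in \mathfrak{se}(3)$ can be mapped to a rigid transform and applied to a noisy camera pose. It is an illustration under stated assumptions, not the paper's implementation: the function names (`hat`, `se3_exp`, `refine_pose`), the ordering of the 6-vector as (rotation, translation), and the choice to left-multiply the correction onto the noisy pose are all hypothetical. In an actual pipeline the correction would be an optimizable parameter updated by an autodiff framework; NumPy is used here only to show the mapping.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to its skew-symmetric cross-product matrix."""
    wx, wy, wz = w
    return np.array([[0.0, -wz, wy],
                     [wz, 0.0, -wx],
                     [-wy, wx, 0.0]])

def se3_exp(xi):
    """Exponential map from se(3) to a 4x4 rigid transform.

    Assumption: xi = (w, v), with rotation part w first and translation part v second.
    """
    xi = np.asarray(xi, dtype=float)
    w, v = xi[:3], xi[3:]
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-6:
        # Taylor expansions avoid division by zero for tiny rotations.
        a = 1.0 - theta**2 / 6.0
        b = 0.5 - theta**2 / 24.0
        c = 1.0 / 6.0 - theta**2 / 120.0
    else:
        a = np.sin(theta) / theta
        b = (1.0 - np.cos(theta)) / theta**2
        c = (theta - np.sin(theta)) / theta**3
    R = np.eye(3) + a * K + b * (K @ K)   # Rodrigues' rotation formula
    V = np.eye(3) + b * K + c * (K @ K)   # couples rotation and translation
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ v
    return T

def refine_pose(noisy_pose, xi):
    """Apply the correction on the left of the noisy pose (an assumed convention)."""
    return se3_exp(xi) @ noisy_pose
```

A zero correction leaves the pose unchanged, so initializing every $\boldsymbol{\theta}_i$ at zero starts optimization from the given noisy extrinsics.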
Abstract
Accurate 3D reconstruction from multi-view images is essential for downstream robotic tasks such as navigation, manipulation, and environment understanding. However, obtaining precise camera poses in real-world settings remains challenging, even when calibration parameters are known. This limits the practicality of existing NeRF-based methods that rely heavily on accurate extrinsic estimates. Furthermore, their implicit volumetric representations differ significantly from the widely adopted polygonal meshes, making rendering and manipulation inefficient in standard 3D software. In this work, we propose a robust framework that reconstructs high-quality, editable 3D meshes directly from multi-view images with noisy extrinsic parameters. Our approach jointly refines camera poses while learning an implicit scene representation that captures fine geometric detail and photorealistic appearance. The resulting meshes are compatible with common 3D graphics and robotics tools, enabling efficient downstream use. Experiments on standard benchmarks demonstrate that our method achieves accurate and robust 3D reconstruction under pose uncertainty, bridging the gap between neural implicit representations and practical robotic applications.
