NeRF-VIO: Map-Based Visual-Inertial Odometry with Initialization Leveraging Neural Radiance Fields

Yanyu Zhang; Dongming Wang; Jie Xu; Mengyuan Liu; Pengxiang Zhu; Wei Ren

TL;DR

The paper tackles drift and relocalization challenges in map-based visual-inertial odometry for AR by introducing NeRF-VIO, which uses a pose-initialization MLP to relocalize the first frame within a pre-trained NeRF map and a two-stage update in an MSCKF framework that fuses both captured and NeRF-rendered images. A left-invariant SE(3) geodesic loss and a left-invariant metric on $\mathfrak{se}(3)$ ensure robust initialization across frame changes, while grid-based SSIM mitigates environmental alterations. Experiments on a real AR table dataset show that NeRF-VIO achieves superior initialization accuracy and latency compared to iNeRF and outperforms MSCKF in VIO accuracy, even under significant scene changes. The approach demonstrates practical value for robust, real-time AR localization using a pre-built NeRF map and online NeRF rendering.

Abstract

A prior map serves as a foundational reference for localization in context-aware applications such as augmented reality (AR). Because it provides contextual information about the environment, the prior map is a vital tool for mitigating drift. In this paper, we propose a map-based visual-inertial localization algorithm (NeRF-VIO) with initialization using neural radiance fields (NeRF). Our algorithm uses a multilayer perceptron model and redefines the loss function as the geodesic distance on $SE(3)$, ensuring the invariance of the initialization model under a frame change within $\mathfrak{se}(3)$. The evaluation demonstrates that our model outperforms the existing NeRF-based initialization solution in both accuracy and efficiency. By integrating a two-stage update mechanism within a multi-state constraint Kalman filter (MSCKF) framework, the state of NeRF-VIO is constrained by both captured images from an onboard camera and rendered images from a pre-trained NeRF model. The proposed algorithm is validated using a real-world AR dataset; the results indicate that our two-stage update pipeline outperforms MSCKF across all data sequences.
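To make the idea of a frame-invariant pose loss concrete, the following is a minimal NumPy sketch of one commonly used left-invariant distance on $SE(3)$: the rotation angle of the relative pose combined with a weighted norm of its translation. It is an illustration under stated assumptions, not the paper's loss. The function names (`se3_distance`, `make_pose`) and the `trans_weight` parameter are hypothetical, and the exact metric and weighting used by NeRF-VIO may differ.

```python
import numpy as np


def se3_inverse(T):
    """Invert a 4x4 homogeneous rigid-body transform."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv


def rotation_angle(R):
    """Rotation angle (radians) of a 3x3 rotation matrix."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.arccos(cos_theta))


def se3_distance(T_a, T_b, trans_weight=1.0):
    """Distance between two poses computed from the relative pose T_a^{-1} T_b.

    Assumption: squared rotation angle plus a weighted squared translation
    norm. `trans_weight` is a hypothetical knob balancing radians and meters.
    """
    T_rel = se3_inverse(T_a) @ T_b
    theta = rotation_angle(T_rel[:3, :3])
    trans = np.linalg.norm(T_rel[:3, 3])
    return float(np.sqrt(theta**2 + trans_weight * trans**2))


def make_pose(axis, angle, t):
    """Build a pose from an axis-angle rotation (Rodrigues) and a translation."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    R = np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


# Two example poses and an arbitrary change of reference frame G.
T_a = make_pose([0, 0, 1], 0.3, [1.0, 0.0, 0.5])
T_b = make_pose([1, 1, 0], 0.8, [0.2, -0.4, 1.0])
G = make_pose([0, 1, 0], 1.1, [3.0, -2.0, 0.7])

d_original = se3_distance(T_a, T_b)
d_reframed = se3_distance(G @ T_a, G @ T_b)  # both poses expressed in a new frame
```

Because $(G T_a)^{-1}(G T_b) = T_a^{-1} T_b$, the relative pose, and therefore the distance, does not depend on the choice of reference frame, so `d_original` and `d_reframed` agree up to floating-point error. A loss built this way penalizes the same pose error identically regardless of where the map origin is placed.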
