Ternary-Type Opacity and Hybrid Odometry for RGB NeRF-SLAM
Junru Lin, Asen Nachkov, Songyou Peng, Luc Van Gool, Danda Pani Paudel
TL;DR
This work tackles RGB-only NeRF-SLAM by introducing a ternary-type opacity (TT) prior and a hybrid odometry (HO) pipeline. TT concentrates ray weights near the surface depth via a softly binarized decoder, enabling more accurate depth rendering and faster map convergence. HO combines gradient-based warping for coarse pose initialization with bundle adjustment (BA) for refinement, improving speed and robustness. The approach yields state-of-the-art tracking and mapping results on Replica and 7-Scenes, with reported speedups of about 6x over baselines such as DIM-SLAM and substantial robustness to reduced BA iterations. Together, TT and HO provide a practical path toward efficient RGB-only NeRF-SLAM that leverages real-world surface priors to improve fidelity and speed.
Abstract
In this work, we address the challenge of deploying Neural Radiance Fields (NeRFs) in Simultaneous Localization and Mapping (SLAM) when depth information is unavailable and only RGB inputs can be used. The key to unlocking the full potential of NeRF in such a challenging context lies in the integration of real-world priors. A crucial prior we explore is the binary opacity prior of 3D space with opaque objects. To effectively incorporate this prior into the NeRF framework, we introduce a ternary-type opacity (TT) model instead, which categorizes points on a ray intersecting a surface into three regions: before, on, and behind the surface. This enables more accurate depth rendering, which in turn improves the performance of image warping techniques. Building on this, we further propose a novel hybrid odometry (HO) scheme that merges bundle adjustment and warping-based localization. Our integrated approach of TT and HO achieves state-of-the-art performance on synthetic and real-world datasets, in terms of both speed and accuracy. This breakthrough underscores the potential of NeRF-SLAM in navigating complex environments with high fidelity.
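To illustrate why an opacity profile with three regions (before, on, and behind the surface) sharpens rendered depth, the following is a minimal sketch and not the paper's implementation. It assumes standard NeRF-style alpha compositing along a ray. The function names, the `sharpness` parameter, and the use of a steep sigmoid as a stand-in for a softly binarized decoder are all hypothetical choices made for illustration.

```python
import math


def soft_binarize(signed_offset, sharpness):
    """Hypothetical stand-in for a softly binarized opacity decoder.

    signed_offset: sample depth minus surface depth (negative = before surface).
    A large `sharpness` pushes opacity toward 0 before the surface and toward 1
    behind it, leaving intermediate values only near the surface itself.
    """
    return 1.0 / (1.0 + math.exp(-sharpness * signed_offset))


def ray_weights(opacities):
    """Alpha-compositing weights: w_i = alpha_i * prod_{j<i} (1 - alpha_j)."""
    weights = []
    transmittance = 1.0  # fraction of the ray not yet absorbed
    for alpha in opacities:
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights


def render_depth(depths, weights):
    """Expected depth along the ray under the compositing weights."""
    return sum(d * w for d, w in zip(depths, weights))


if __name__ == "__main__":
    surface_depth = 2.0  # assumed ground-truth surface location on this ray
    depths = [1.0, 1.5, 2.0, 2.5, 3.0]
    offsets = [d - surface_depth for d in depths]

    for sharpness in (1.0, 20.0):
        alphas = [soft_binarize(o, sharpness) for o in offsets]
        weights = ray_weights(alphas)
        print(f"sharpness={sharpness}")
        print("  opacities:", [round(a, 3) for a in alphas])
        print("  weights:  ", [round(w, 3) for w in weights])
        print("  depth:    ", round(render_depth(depths, weights), 3))
```

With a low `sharpness`, the weights spread across all samples, including those in front of the surface. With a high `sharpness`, the opacities fall into three groups (near zero, intermediate, near one) and almost all of the weight lands on the samples at and immediately behind the surface, so the rendered depth is confined to the sampling interval around the surface.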
