Learning Neural Volumetric Pose Features for Camera Localization
Jingyu Lin, Jiaqi Gu, Bojian Wu, Lubin Fan, Renjie Chen, Ligang Liu, Jieping Ye
TL;DR
This work tackles camera localization by addressing the limitations of Absolute Pose Regression (APR) with a neural volumetric pose feature called PoseMap. PoseMap is learned by augmenting NeRF with a dedicated pose branch (NeRF-P) and is trained alongside an APRNet that regresses $SE(3)$ camera poses; together they produce discriminative pose representations and enable novel-view synthesis for data augmentation. The authors also introduce a self-supervised online alignment mechanism that uses unlabelled images to further refine the pose features. The method achieves average gains of 14.28% and 20.51% on indoor and outdoor benchmark scenes and delivers state-of-the-art accuracy among APR methods. Overall, PoseMap demonstrates that neural volumetric features can encode implicit pose information, enabling robust, data-efficient camera localization and paving the way for further integration with structure-based cues.
Abstract
We introduce a novel neural volumetric pose feature, termed PoseMap, designed to enhance camera localization by encapsulating the information between images and the associated camera poses. Our framework leverages an Absolute Pose Regression (APR) architecture, together with an augmented NeRF module. This integration not only facilitates the generation of novel views to enrich the training dataset but also enables the learning of effective pose features. Additionally, we extend our architecture for self-supervised online alignment, allowing our method to be used and fine-tuned for unlabelled images within a unified framework. Experiments demonstrate that our method achieves average performance gains of 14.28% and 20.51% on indoor and outdoor benchmark scenes, outperforming existing APR methods with state-of-the-art accuracy.
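The accuracy figures above summarize improvements in camera pose error. The abstract does not spell out the error metrics, so the following is only a minimal sketch under an assumed convention that is common in APR evaluations: translation error as the Euclidean distance between camera positions, and rotation error as the angle between orientations given as quaternions. The function names, the `(w, x, y, z)` quaternion layout, and the example poses are hypothetical and not taken from the paper.

```python
import math


def translation_error(t_pred, t_true):
    """Euclidean distance between two 3D camera positions (same units as the inputs)."""
    return math.dist(t_pred, t_true)


def rotation_error_deg(q_pred, q_true):
    """Angle in degrees between two rotations given as quaternions (w, x, y, z)."""
    # Normalize both quaternions so unnormalized regressor outputs are handled.
    n_pred = math.sqrt(sum(c * c for c in q_pred))
    n_true = math.sqrt(sum(c * c for c in q_true))
    dot = sum(a * b for a, b in zip(q_pred, q_true)) / (n_pred * n_true)
    # q and -q encode the same rotation, so take |dot|; clamp against rounding error.
    dot = min(1.0, abs(dot))
    return math.degrees(2.0 * math.acos(dot))


# Hypothetical poses: the prediction is offset by (3, 4, 0) and rotated
# by 90 degrees about the z-axis relative to the ground truth.
q_identity = (1.0, 0.0, 0.0, 0.0)
q_z90 = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))

t_err = translation_error((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
r_err = rotation_error_deg(q_identity, q_z90)
print(f"translation error: {t_err:.2f}, rotation error: {r_err:.2f} deg")
```

Taking the absolute value of the quaternion dot product keeps the rotation error in the range of 0 to 180 degrees regardless of the sign the regressor happens to output.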
