Improved 3D Point-Line Mapping Regression for Camera Relocalization
Bach-Thuan Bui, Huy-Hoang Bui, Yasuyuki Fujii, Dinh-Tuan Tran, Joo-Ho Lee
TL;DR
This work tackles camera relocalization by learning the 3D coordinates of points and lines in two dedicated regression branches, which mitigates the bias caused by the imbalance between the two feature types. It introduces a focus-mode architecture in which an early learnable pruning layer and self-attention modules refine descriptors before regression, with a line transformer encoder handling line features. The approach shows consistent improvements over prior regression-based methods on 7Scenes and Indoor-6 and is competitive with feature-matching (FM) based systems while reducing storage and computational demands. Ablation studies and a discussion of practical considerations support the design, and the code is released for public use and benchmarking.
Abstract
In this paper, we present a new approach for improving 3D point and line mapping regression for camera relocalization. Previous methods typically either rely on feature matching (FM) with stored descriptors or use a single network to encode both points and lines. FM-based methods perform well in large-scale environments, but they become computationally expensive as the number of mapping points and lines grows. Conversely, approaches that encode mapping features within a single network reduce the memory footprint but are prone to overfitting, as they may capture unnecessary correlations between points and lines. We argue that these features should be learned independently, each with a distinct focus, to achieve optimal accuracy. To this end, we introduce a new architecture that learns to prioritize each feature type independently before combining them for localization. Experimental results demonstrate that our approach significantly enhances 3D map point and line regression performance for camera relocalization. The implementation of our method will be publicly available at: https://github.com/ais-lab/pl2map/.
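As a rough illustration of the idea of learning points and lines independently, the following is a minimal NumPy sketch, not the released implementation. It assumes each branch scores descriptors with a learnable pruning vector, refines the kept descriptors with single-head self-attention, and regresses 3D coordinates with a linear head. The class name `Branch`, the 0.5 pruning threshold, the random weights, and the choice to parametrize a line by its two 3D endpoints (6 values) are all assumptions made for this example; the line transformer encoder mentioned above is not modeled.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class Branch:
    """Hypothetical regression branch: prune -> self-attention -> linear head."""

    def __init__(self, dim, out_dim, rng):
        scale = 1.0 / np.sqrt(dim)
        self.dim = dim
        self.out_dim = out_dim
        # Random weights stand in for parameters that would be learned.
        self.w_prune = rng.normal(0.0, scale, (dim,))
        self.w_q = rng.normal(0.0, scale, (dim, dim))
        self.w_k = rng.normal(0.0, scale, (dim, dim))
        self.w_v = rng.normal(0.0, scale, (dim, dim))
        self.w_out = rng.normal(0.0, scale, (dim, out_dim))

    def __call__(self, desc, keep_thresh=0.5):
        # Early pruning: keep descriptors whose sigmoid score passes the threshold.
        score = 1.0 / (1.0 + np.exp(-(desc @ self.w_prune)))
        keep = score > keep_thresh
        x = desc[keep]
        if x.shape[0] == 0:
            return np.zeros((0, self.out_dim)), keep
        # Single-head self-attention with a residual connection.
        q, k, v = x @ self.w_q, x @ self.w_k, x @ self.w_v
        attn = softmax(q @ k.T / np.sqrt(self.dim), axis=-1)
        x = x + attn @ v
        # Linear regression head producing 3D coordinates.
        return x @ self.w_out, keep


rng = np.random.default_rng(0)
point_branch = Branch(dim=16, out_dim=3, rng=rng)  # one 3D point per descriptor
line_branch = Branch(dim=16, out_dim=6, rng=rng)   # two 3D endpoints per line

point_desc = rng.normal(size=(50, 16))  # stand-in 2D keypoint descriptors
line_desc = rng.normal(size=(20, 16))   # stand-in 2D line descriptors

# Each branch runs on its own feature type; neither sees the other's input.
points_3d, point_keep = point_branch(point_desc)
lines_3d, line_keep = line_branch(line_desc)
print(points_3d.shape, lines_3d.shape)
```

In this sketch the two branches share no weights, so the more numerous point features cannot dominate what the line branch learns. The kept 2D-3D correspondences from both branches would then be passed together to a pose solver, which is outside the scope of the example.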
