From Sparse to Dense: Camera Relocalization with Scene-Specific Detector from Feature Gaussian Splatting
Zhiwei Huang, Hailin Yu, Yichun Shentu, Jin Yuan, Guofeng Zhang
TL;DR
STDLoc addresses robust camera relocalization by introducing a scene-specific Feature Gaussian representation and a sparse-to-dense localization pipeline. Instead of traditional image-retrieval-based localization, it uses a matching-oriented landmark sampling strategy and a scene-specific detector to obtain a reliable initial pose, then refines that pose by dense feature-map alignment. Key contributions are the matching-oriented sampling strategy, a self-supervised scene-specific detector, and a full sparse-to-dense localization framework that leverages a learned feature field for accurate 6DoF pose estimation in both indoor and outdoor environments. Empirical results on 7-Scenes and Cambridge Landmarks show that STDLoc achieves state-of-the-art localization accuracy and recall, is robust to illumination changes and weak textures, and runs at several FPS on modern GPUs.
Abstract
This paper presents a novel camera relocalization method, STDLoc, which leverages Feature Gaussian as its scene representation. STDLoc is a full relocalization pipeline that achieves accurate relocalization without relying on any pose prior. Unlike previous coarse-to-fine localization methods, which require image retrieval followed by feature matching, we propose a novel sparse-to-dense localization paradigm. Based on this scene representation, we introduce a novel matching-oriented Gaussian sampling strategy and a scene-specific detector to achieve efficient and robust initial pose estimation. Furthermore, starting from the initial localization result, we align the query feature map to the Gaussian feature field by dense feature matching to enable accurate localization. Experiments on indoor and outdoor datasets show that STDLoc outperforms current state-of-the-art localization methods in terms of localization accuracy and recall.
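The sparse stage described above pairs features at detected query keypoints with features of sampled Gaussian landmarks. The following is a minimal sketch of one way such 2D-3D correspondences could be formed, not the authors' implementation. It assumes mutual nearest-neighbor matching on cosine similarity, which the abstract does not specify. The function names, the `min_sim` threshold, and the toy descriptors are all hypothetical.

```python
import math


def normalize(v):
    """Scale a descriptor to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def cosine_matrix(a, b):
    """Cosine similarity between every descriptor in a and every descriptor in b."""
    a = [normalize(v) for v in a]
    b = [normalize(v) for v in b]
    return [[sum(x * y for x, y in zip(u, w)) for w in b] for u in a]


def mutual_nearest_matches(query_desc, landmark_desc, min_sim=0.5):
    """Return (query_idx, landmark_idx) pairs that are each other's best match.

    Assumption: mutual nearest neighbors plus a similarity threshold; the
    source does not state which matching rule is actually used.
    """
    if not query_desc or not landmark_desc:
        return []
    sim = cosine_matrix(query_desc, landmark_desc)
    # Best landmark for each query keypoint.
    best_l = [max(range(len(row)), key=row.__getitem__) for row in sim]
    # Best query keypoint for each landmark.
    best_q = [
        max(range(len(sim)), key=lambda i, j=j: sim[i][j])
        for j in range(len(landmark_desc))
    ]
    return [
        (i, j)
        for i, j in enumerate(best_l)
        if best_q[j] == i and sim[i][j] >= min_sim
    ]


# Toy data (hypothetical): three query keypoints and three sampled landmarks.
keypoints_2d = [(120.0, 80.0), (300.5, 210.0), (50.0, 400.0)]
query_desc = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
landmarks_3d = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (0.0, 1.0, 3.0)]
landmark_desc = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.0, 1.0]]

matches = mutual_nearest_matches(query_desc, landmark_desc)
correspondences = [(keypoints_2d[i], landmarks_3d[j]) for i, j in matches]
```

In this toy case the third keypoint is ambiguous between two landmarks and is dropped by the mutual check. In a full pipeline, correspondences like these would typically be passed to a pose solver to obtain the initial pose that the dense stage then refines; that step is omitted here.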
