A Guide to Structureless Visual Localization
Vojtech Panek, Qunjie Zhou, Yaqing Ding, Sérgio Agostinho, Zuzana Kukelova, Torsten Sattler, Laura Leal-Taixé
TL;DR
This paper surveys structureless visual localization methods, contrasting them with traditional structure-based approaches that rely on explicit 3D scene models. It provides a comprehensive review and an extensive experimental comparison across families such as pose triangulation, semi-generalized relative pose estimation, local SfM on the fly, and relative pose regression, on datasets such as Aachen Day-Night, Extended CMU Seasons, and NAVER indoor scenes. The findings show that approaches with stronger geometric reasoning achieve higher pose accuracy, with local SfM on the fly delivering the best results, while semi-generalized relative pose estimation offers the best accuracy–runtime trade-off; regression-based methods lag behind geometry-based methods. Overall, structureless methods can be competitive with structure-based methods, offering flexibility and ease of scene updates, and the results point to promising directions for improving accuracy while maintaining efficiency.
Abstract
Visual localization algorithms, i.e., methods that estimate the camera pose of a query image in a known scene, are core components of many applications, including self-driving cars and augmented/mixed reality systems. State-of-the-art visual localization algorithms are structure-based, i.e., they store a 3D model of the scene and use 2D-3D correspondences between the query image and 3D points in the model for camera pose estimation. While such approaches are highly accurate, they are also rather inflexible when it comes to adjusting the underlying 3D model after changes in the scene. Structureless localization approaches represent the scene as a database of images with known poses and thus offer a much more flexible representation that can be easily updated by adding or removing images. Although there is a large body of literature on structure-based approaches, there is significantly less work on structureless methods. Hence, this paper is dedicated to providing what is, to the best of our knowledge, the first comprehensive discussion and comparison of structureless methods. Extensive experiments show that approaches that use a higher degree of classical geometric reasoning generally achieve higher pose accuracy. In particular, approaches based on classical absolute or semi-generalized relative pose estimation outperform very recent methods based on pose regression by a wide margin. Compared with state-of-the-art structure-based approaches, the flexibility of structureless methods comes at the cost of (slightly) lower pose accuracy, indicating an interesting direction for future work.
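
To make the structureless representation concrete, the following is a minimal sketch, not the paper's implementation. It assumes the scene is stored only as reference images with known camera centers, and that some relative pose estimator (not shown) has already produced, for each retrieved reference image, a world-frame direction pointing from that reference camera towards the query camera. The names `ImageDatabase` and `triangulate_center` are hypothetical. The query camera center is then recovered as the least-squares intersection of those rays, which is one simple form of pose triangulation.

```python
import numpy as np


class ImageDatabase:
    """Hypothetical structureless scene: posed reference images, no 3D points."""

    def __init__(self):
        self.centers = {}  # image name -> camera center in world coordinates

    def add(self, name, center):
        # Updating the scene only requires inserting a posed image.
        self.centers[name] = np.asarray(center, dtype=float)

    def remove(self, name):
        # Outdated images are simply dropped; nothing else is rebuilt.
        self.centers.pop(name, None)


def triangulate_center(centers, directions):
    """Least-squares point closest to all rays c_i + t * d_i.

    Solves sum_i (I - d_i d_i^T) x = sum_i (I - d_i d_i^T) c_i.
    Requires at least two non-parallel directions.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for c, d in zip(centers, directions):
        d = np.asarray(d, dtype=float)
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)  # projector orthogonal to the ray
        A += P
        b += P @ np.asarray(c, dtype=float)
    return np.linalg.solve(A, b)


# Toy example with noise-free directions (assumed, for illustration only).
db = ImageDatabase()
db.add("ref_a", [0.0, 0.0, 0.0])
db.add("ref_b", [5.0, 0.0, 0.0])
db.add("ref_c", [0.0, 4.0, 1.0])
db.add("stale", [9.0, 9.0, 9.0])
db.remove("stale")  # scene update without touching any 3D model

true_center = np.array([1.0, 2.0, 3.0])
ref_centers = list(db.centers.values())
ref_dirs = [true_center - c for c in ref_centers]
estimate = triangulate_center(ref_centers, ref_dirs)
```

With exact directions the estimate coincides with the true camera center; with noisy directions it is the point minimizing the summed squared distances to the rays. Recovering the camera rotation, and estimating the directions themselves from image matches, are outside the scope of this sketch.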
