A Survey on Monocular Re-Localization: From the Perspective of Scene Map Representation

Jinyu Miao; Kun Jiang; Tuopu Wen; Yunlong Wang; Peijing Jia; Xuhe Zhao; Qian Cheng; Zhongyang Xiao; Jin Huang; Zhihua Zhong; Diange Yang

TL;DR

This survey reframes monocular re-localization (MRL) as an interaction between a monocular query and a scene map, and systematically classifies methods by the representation form of the scene map into five categories: geo-tagged frame maps, visual landmark maps, point cloud maps, vectorized HD maps, and learnt implicit maps. It analyzes core components—visual place recognition, relative pose estimation, feature extraction/matching, and pose solvers—within each map category, surveys public benchmarks and datasets for coarse and fine localization, and discusses the strengths, limitations, and practical considerations of each representation. The paper also highlights frontier topics such as end-to-end localization, cross-modal and cross-domain localization, map compression, and neural implicit maps, and provides a living repository to track progress. The work serves as a comprehensive reference for researchers and practitioners, guiding method selection and future research in robust, scalable monocular localization for autonomous systems.

Abstract

Monocular Re-Localization (MRL) is a critical component in autonomous applications, estimating 6 degree-of-freedom ego poses w.r.t. the scene map from monocular images. In recent decades, significant progress has been made in the development of MRL techniques, and numerous algorithms have achieved remarkable localization accuracy and robustness. In MRL, scene maps are represented in various forms, and the representation determines both how MRL methods work and how well they perform. However, to the best of our knowledge, existing surveys do not systematically review the relationship between MRL solutions and the scene map representation they use. This survey fills the gap by comprehensively reviewing MRL methods from this perspective, promoting further research. 1) We begin with the problem definition of MRL, explore current challenges, and compare this survey with existing ones. 2) Many well-known MRL methods are categorized into five classes according to the representation form of the map they use, i.e., geo-tagged frames, visual landmarks, point clouds, vectorized semantic maps, and neural network-based maps, and each class is reviewed. 3) To quantitatively and fairly compare MRL methods that use different maps, we introduce some public datasets and report the performance of some state-of-the-art MRL methods. The strengths and weaknesses of MRL methods with different maps are analyzed. 4) We finally introduce some topics of interest in this field and give personal opinions. This survey can serve as a valuable reference for MRL, and a continuously updated summary of this survey is publicly available to the community at: https://github.com/jinyummiao/map-in-mono-reloc.
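
As a rough illustration of how a monocular query interacts with the simplest of these map forms, geo-tagged frames, the sketch below retrieves the most similar mapped frame and returns its stored pose as a coarse estimate. It is a toy example, not a method from the survey: the function names, the dictionary layout of the map, the 3-dimensional descriptors, and the (x, y, z, roll, pitch, yaw) pose tuples are all assumptions made for illustration, and real systems would obtain descriptors from a learned or hand-crafted image feature extractor.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve_pose(query_descriptor, frame_map):
    """Return (pose, similarity) of the mapped frame closest to the query.

    frame_map is a hypothetical list of dicts, each holding a global
    image descriptor and the geo-tagged 6-DoF pose of that frame.
    """
    if not frame_map:
        raise ValueError("frame_map is empty")
    best = max(
        frame_map,
        key=lambda f: cosine_similarity(query_descriptor, f["descriptor"]),
    )
    return best["pose"], cosine_similarity(query_descriptor, best["descriptor"])


# Toy map: three frames with made-up descriptors and poses
# given as (x, y, z, roll, pitch, yaw).
frame_map = [
    {"descriptor": [1.0, 0.0, 0.0], "pose": (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)},
    {"descriptor": [0.0, 1.0, 0.0], "pose": (5.0, 0.0, 0.0, 0.0, 0.0, 1.57)},
    {"descriptor": [0.0, 0.0, 1.0], "pose": (5.0, 5.0, 0.0, 0.0, 0.0, 3.14)},
]

query = [0.1, 0.9, 0.0]  # made-up descriptor of the query image
pose, score = retrieve_pose(query, frame_map)
```

Retrieval of this kind only yields the pose of the nearest mapped frame; the survey's other map categories are concerned with refining such a coarse estimate into a precise 6-DoF pose.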

Paper Structure (36 sections, 19 equations, 16 figures, 10 tables)

Introduction
Background
Problem Formulation and Symbols Definition
Challenges of Monocular Re-Localization
Comparison with other surveys
Geo-tagged Frame Map
Visual Place Recognition
Relative Pose Estimation
Visual Landmark Map
Local Feature Extraction-then-Matching
Joint Local Feature Extraction and Matching
Pose Solver
Further Improvements
Point Cloud Map
Geometry-based Cross-modal Localization
...and 21 more sections

Figures (16)

Figure 1: An overview of the development of MRL. a) Number of papers published each year with "visual re-localization" in the title, b) some landmark works in MRL research.
Figure 2: The structure of this survey.
Figure 3: Typical challenges of MRL methods. a) day-night changes with varying illumination [tokyo247], b) seasonal and weather changes (sunny in summer vs. snowy in winter) [seqslam], c) viewpoint changes [university1652], d) distinct places with visually similar appearances (the perceptual aliasing problem) [malaga].
Figure 4: A diagram of MRL methods using geo-tagged frames as map, including a) VPR and b) RPR methods.
Figure 5: Four kinds of image features in VPR, including a) image global feature, b) image local feature, c) image sequence feature, and d) image semantics.
...and 11 more figures
