On the Estimation of Image-matching Uncertainty in Visual Place Recognition

Mubariz Zaffar; Liangliang Nan; Julian F. P. Kooij

TL;DR

This work compares, for the first time, the main approaches for estimating image-matching uncertainty: traditional retrieval-based uncertainty estimation, more recent data-driven aleatoric uncertainty estimation, and compute-intensive geometric verification. It also formulates a simple baseline method, “SUE”, which outperforms the other efficient uncertainty estimation methods.

Abstract

In Visual Place Recognition (VPR) the pose of a query image is estimated by comparing the image to a map of reference images with known reference poses. As is typical for image retrieval problems, a feature extractor maps the query and reference images to a feature space, where a nearest neighbor search is then performed. However, until recently little attention has been given to quantifying the confidence that a retrieved reference image is a correct match. Highly certain but incorrect retrieval can lead to catastrophic failure of VPR-based localization pipelines. This work compares, for the first time, the main approaches for estimating image-matching uncertainty, including traditional retrieval-based uncertainty estimation, more recent data-driven aleatoric uncertainty estimation, and compute-intensive geometric verification. We further formulate a simple baseline method, “SUE”, which, unlike the other methods, considers the freely available poses of the reference images in the map. Our experiments reveal that a simple L2-distance between the query and reference descriptors is already a better estimate of image-matching uncertainty than current data-driven approaches. SUE outperforms the other efficient uncertainty estimation methods, and its uncertainty estimates complement the computationally expensive geometric verification approach. Future work on uncertainty estimation in VPR should consider the baselines discussed in this work.
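
The retrieval step and the L2-distance baseline described in the abstract can be illustrated with a minimal sketch. The descriptor values, the function names `l2_distance` and `retrieve_top_k`, and the use of plain Python lists are assumptions made for illustration only; the paper's actual feature extractor and search implementation are not specified here.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_top_k(query, references, k=1):
    """Return [(reference_index, distance), ...] for the k nearest references."""
    scored = [(i, l2_distance(query, f)) for i, f in enumerate(references)]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


# Toy 2-D descriptors (hypothetical values, purely illustrative).
references = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
query = [0.9, 0.1]

top = retrieve_top_k(query, references, k=2)
best_index, best_distance = top[0]

# L2 baseline: the distance to the best match serves as the uncertainty
# score, so a larger distance means a less trustworthy match.
uncertainty = best_distance
```

Ranking queries by this score and thresholding it is one simple way to trade precision against recall, which appears to be the kind of comparison the paper's Precision-Recall curves report.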

Paper Structure (28 sections, 8 equations, 14 figures, 7 tables)

Introduction
Related work
Methodology
Uncertainty estimation in VPR
Formalizing VPR
Current VPR uncertainty estimation categories
Spatial uncertainty estimation (SUE) for VPR
Complementing geometric verification
Experiments
Experimental setup
Performance comparison
Complementing geometric verification
Ablation study
Discussion
Conclusions
...and 13 more sections

Figures (14)

Figure 1: The Precision-Recall curves on the Pittsburgh dataset [arandjelovic2016netvlad] for the three common categories of VPR uncertainty estimation methods (RUE, DUE, GV), and for our proposed baseline SUE which uniquely considers spatial locations of the top-K references. The global image descriptors [cai2022stun] are fixed for all methods except BTL [warburg2021bayesian]. The only difference is the confidence given by each uncertainty estimation method to the best-matched reference descriptors for the corresponding queries. The legend lists the Area-under-the-Precision-Recall-curves. As GV methods are two to three orders of magnitude more computationally expensive than the others, they are plotted as dotted lines. Surprisingly, simple L2-distance in feature space is a better estimate of VPR uncertainty than recent deep learning-based uncertainty estimates. SUE outperforms all other efficient uncertainty estimation methods.
Figure 2: In VPR, a query $q$ is compared in feature space to features $f_i \in \mathcal{R}$ of reference images with known poses. The nearest neighbors $f_{(1)}, \cdots, f_{(K)}$ are retrieved as matches. Left: The retrieved references $I_{(1)}, I_{(2)}, I_{(3)}$ share similar visual content with the query (walls, pillars, and blobs), but are geographically far apart, reflecting high uncertainty that the matched reference is correct. Right: For another query, the retrieved references are geographically close together, indicating low uncertainty.
Figure 3: Examples of the two least and the two most uncertain query images with the corresponding nearest neighbor on the Pittsburgh dataset. The colors/symbols indicate whether the retrieved image is a correct match.
Figure 4: Two queries and their nearest neighbor reference images that illustrate cases where SUE outperforms other methods. Ideally a method assigns high uncertainty to the mismatched query and low uncertainty to the correct match, as SUE does here.
Figure 5: The relation between geometric verification uncertainty (x-axis) and the L2/STUN/SUE uncertainty (y-axis) on the Pittsburgh dataset [arandjelovic2016netvlad]. Each point represents a query, with blue indicating a correct match, and red otherwise. The linear SVM boundaries are shown as black lines, while the dashed lines are the SVM margins. Scores have been linearly scaled to the $[0,1]$ range based on the min/max value in the training data, and for better visualization the vertical scale is in log-space, hence the SVM boundaries appear non-linear. The class distributions in the right-most plot reveal that SUE complements geometric-verification, especially when the latter has low confidence.
...and 9 more figures
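
The intuition in the Figure 2 caption, that top-K references lying far apart geographically signal high uncertainty, can be sketched as follows. This is not the paper's SUE formulation, which is not given on this page: the mean pairwise distance used here, the function name `spatial_spread`, and the 2-D positions are all assumptions chosen only to make the idea concrete.

```python
import math
from itertools import combinations


def spatial_spread(positions):
    """Mean pairwise Euclidean distance between 2-D reference positions.

    Illustrative stand-in for a spatial uncertainty score; it is NOT the
    SUE formula from the paper.
    """
    pairs = list(combinations(positions, 2))
    if not pairs:
        return 0.0  # a single reference has no spread
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


# Hypothetical positions (in metres) of the top-3 retrieved references.
far_apart = [(0.0, 0.0), (300.0, 0.0), (0.0, 400.0)]    # like Figure 2, left
close_together = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]   # like Figure 2, right

high_uncertainty = spatial_spread(far_apart)
low_uncertainty = spatial_spread(close_together)
```

Because the reference poses are already stored in the map, a score of this kind needs no extra network inference; it only requires looking up the positions of the retrieved neighbors.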
