To Match or Not to Match: Revisiting Image Matching for Reliable Visual Place Recognition

Davide Sferrazza, Gabriele Berton, Gabriele Trivigno, Carlo Masone

TL;DR

This work questions the universal benefit of re-ranking in Visual Place Recognition by showing that state-of-the-art retrieval can saturate benchmarks, sometimes making re-ranking detrimental. It proposes image matching as a verification step to estimate retrieval uncertainty, using inlier counts to decide when re-ranking should be applied. A comprehensive benchmark evaluates many image-matching methods across diverse VPR datasets, revealing that inlier-based uncertainty often correlates with potential gains from re-ranking, especially in challenging conditions, while near-perfect retrieval may degrade with post-processing. The findings advocate for adaptive VPR pipelines that selectively employ matching-based verification, improving robustness and efficiency in real-world localization tasks.

Abstract

Visual Place Recognition (VPR) is a critical task in computer vision, traditionally enhanced by re-ranking retrieval results with image matching. However, recent advancements in VPR methods have significantly improved performance, challenging the necessity of re-ranking. In this work, we show that modern retrieval systems often reach a point where re-ranking can degrade results, as current VPR datasets are largely saturated. We propose using image matching as a verification step to assess retrieval confidence, demonstrating that inlier counts can reliably predict when re-ranking is beneficial. Our findings shift the paradigm of retrieval pipelines, offering insights for more robust and adaptive VPR systems. The code is available at https://github.com/FarInHeight/To-Match-or-Not-to-Match.

Paper Structure

This paper contains 13 sections, 10 figures, 6 tables.

Figures (10)

  • Figure 1: Re-ranking with SuperGlue on top of VPR methods from different years (NetVLAD [Arandjelovic_2018_netvlad], SFRS [Ge_2020_sfrs], EigenPlaces [Berton2023-eigenplaces], MegaLoc [Berton_2025_megaloc]). In the past, re-ranking the top-$K$ VPR results with powerful image matching methods was guaranteed to improve results. With modern VPR models, this now holds only for certain datasets or types of images. This paper explores this phenomenon, aiming to determine whether re-ranking can be adaptively and confidently triggered for individual queries during deployment.
  • Figure 2: Re-ranking pipeline. The standard re-ranking pipeline consists of first retrieving a shortlist of candidates using a retrieval method, followed by sorting these candidates in descending order based on the number of inliers computed using an image matching method.
  • Figure 3: Example of a case where re-ranking through image matching fails. On the left, the query is shown next to the top-1 retrieved image, which is a positive. On the right is the top-2 retrieved image, which is a negative. SuperGlue + RANSAC finds fewer inliers for the correct pair on the left (only 7) than for the wrong pair on the right (26).
  • Figure 4: Mean Recall@1 after re-ranking versus mean latency for different methods. The mean Recall@1 is computed over the datasets, while the mean latency is the average time to process each query over all datasets. The shortlist of candidates is obtained with MegaLoc, and the distance threshold is fixed at 25 meters.
  • Figure 5: Precision-Recall curves for the top-4 image matching methods on Tokyo 24/7, SF-XL Night, and SF-XL Occlusion, together with SUE as a representative of the baselines. The shortlist of candidates is obtained with MegaLoc, and the distance threshold is fixed at 25 meters.
  • ...and 5 more figures
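
The re-ranking step described in the Figure 2 caption, and the idea of using inlier counts as a verification signal, can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the rule of checking the top-1 inlier count, and the threshold value `min_top1_inliers=20` are all assumptions made for the example. The inlier counts are assumed to be precomputed by some image matching method followed by RANSAC.

```python
from typing import List, Sequence


def rerank_by_inliers(candidates: Sequence[str],
                      inlier_counts: Sequence[int]) -> List[str]:
    """Sort a retrieval shortlist in descending order of inlier count.

    candidates[i] is the i-th retrieved image (retrieval order), and
    inlier_counts[i] is the number of inliers between the query and it.
    Ties keep their original retrieval order because sorted() is stable.
    """
    if len(candidates) != len(inlier_counts):
        raise ValueError("candidates and inlier_counts must have equal length")
    order = sorted(range(len(candidates)), key=lambda i: -inlier_counts[i])
    return [candidates[i] for i in order]


def adaptive_rerank(candidates: Sequence[str],
                    inlier_counts: Sequence[int],
                    min_top1_inliers: int = 20) -> List[str]:
    """Re-rank only when the top-1 retrieval looks uncertain.

    Hypothetical decision rule (an assumption, not taken from the paper):
    if the top-1 candidate already has at least `min_top1_inliers` inliers,
    trust the retrieval order; otherwise fall back to inlier-based re-ranking.
    """
    if not candidates:
        return []
    if len(candidates) != len(inlier_counts):
        raise ValueError("candidates and inlier_counts must have equal length")
    if inlier_counts[0] >= min_top1_inliers:
        return list(candidates)  # retrieval is considered verified
    return rerank_by_inliers(candidates, inlier_counts)


if __name__ == "__main__":
    # Inlier counts from the Figure 3 caption: top-1 (positive) has 7 inliers,
    # top-2 (negative) has 26, so plain re-ranking promotes the negative.
    shortlist = ["top1_positive", "top2_negative"]
    print(rerank_by_inliers(shortlist, [7, 26]))
```

With the counts from Figure 3, plain re-ranking places the negative first, which is the failure mode the caption describes. Under the assumed threshold, the gate in `adaptive_rerank` would also trigger re-ranking for that query, so a simple threshold is not a guarantee; its value would have to be tuned on held-out data for a given matcher and dataset.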