Mismatched: Evaluating the Limits of Image Matching Approaches and Benchmarks

Sierra Bonilla; Chiara Di Vece; Rema Daher; Xinwei Ju; Danail Stoyanov; Francisco Vasconcelos; Sophia Bano

TL;DR

The paper addresses how well current image matching methods generalize to 3D reconstruction across diverse domains. It implements a consistent three-stage SfM pipeline with edge preprocessing and cross-domain image pairing to compare 20 methods on in-domain Niantic and out-of-domain IMC24 data, using fixed, non-tuned settings. Key findings show limited cross-domain generalization, mixed benefits from edge preprocessing, and substantial ambiguity in the $mAA$ evaluation metric, underscoring the need for clearer reporting and broader benchmarks. The work highlights the practical importance of dataset diversity and careful metric design for advancing robust image matching in real-world SfM tasks.
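Since the findings hinge on $mAA$, a minimal sketch of one common reading of a mean-average-accuracy score may help: for each threshold, take the fraction of per-item errors at or below it, then average those fractions over all thresholds. The function name, the error values, and the thresholds below are illustrative assumptions; the summary above does not specify the paper's exact error definition or threshold set.

```python
def mean_average_accuracy(errors, thresholds):
    """Average, over thresholds, of the fraction of errors <= each threshold.

    `errors` and `thresholds` are hypothetical inputs for illustration only;
    real benchmarks define the pose error and threshold set themselves.
    """
    if not errors or not thresholds:
        raise ValueError("errors and thresholds must be non-empty")
    accuracies = []
    for t in thresholds:
        # Fraction of items whose error falls within this threshold.
        hits = sum(1 for e in errors if e <= t)
        accuracies.append(hits / len(errors))
    # Mean of the per-threshold accuracies.
    return sum(accuracies) / len(thresholds)


# Made-up per-camera errors and thresholds (not values from the paper).
example_errors = [0.5, 1.5, 4.0, 12.0]
example_thresholds = [1.0, 2.0, 5.0, 10.0]
score = mean_average_accuracy(example_errors, example_thresholds)
print(score)
```

Under this reading the score depends directly on which errors and thresholds are chosen, so two reports of $mAA$ are only comparable when both are stated.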

Abstract

Three-dimensional (3D) reconstruction from two-dimensional images is an active research field in computer vision, with applications ranging from navigation and object tracking to segmentation and three-dimensional modeling. Traditionally, parametric techniques have been employed for this task. However, recent advancements have seen a shift towards learning-based methods. Given the rapid pace of research and the frequent introduction of new image matching methods, it is essential to evaluate them. In this paper, we present a comprehensive evaluation of various image matching methods using a structure-from-motion pipeline. We assess the performance of these methods on both in-domain and out-of-domain datasets, identifying key limitations in both the methods and benchmarks. We also investigate the impact of edge detection as a pre-processing step. Our analysis reveals that image matching for 3D reconstruction remains an open challenge, necessitating careful selection and tuning of models for specific scenarios, while also highlighting mismatches in how metrics currently represent method performance.

Paper Structure (21 sections, 5 figures, 1 table)

This paper contains 21 sections, 5 figures, 1 table.

Introduction
Related Work
Feature Extraction
Feature Matching
Detector-Free Matcher
Training Framework
Methodology
Edge Detection
Image Pair Generation
Structure-from-Motion
Experiments
Datasets
Experimental Setup
Metrics
Results & Discussion
...and 6 more sections

Figures (5)

Figure 1: Normalized $mAA$ for the out-of-domain (IMC24) [image-matching-challenge-2024] versus in-domain (Niantic) [arnold2022map] datasets.
Figure 2: Flowchart illustrating our experiment pipeline with the edge detection, pair generation, and structure from motion steps.
Figure 3: (a) In-domain Niantic dataset (21 scenes from the validation dataset) [arnold2022map]. (b) Out-of-domain IMC24 dataset (7 scenes from the training data) [image-matching-challenge-2024]. (c) Edge cases from the out-of-domain dataset, including occlusions, rotations, aerial and ground acquisitions, and seasonal and illumination changes.
Figure 4: Visualizing $mAA$ for image matching methods on the IMC24 scenes that represent various challenging categories.
Figure 5: Heatmap representing the image matching models' performance changes when incorporating edge detection.
