Local Feature Matching Using Deep Learning: A Survey

Shibiao Xu, Shunpeng Chen, Rongtao Xu, Changwei Wang, Peng Lu, Li Guo

TL;DR

This survey reviews deep-learning-based local feature matching, dividing methods into Detector-based and Detector-free categories, and explores practical applications in Structure from Motion, Remote Sensing Image Registration, and Medical Image Registration.

Abstract

Local feature matching has wide-ranging applications in computer vision, including image retrieval, 3D reconstruction, and object recognition. However, improving the accuracy and robustness of matching remains challenging due to factors such as viewpoint and lighting variations. In recent years, the introduction of deep learning models has sparked widespread exploration of local feature matching techniques. This survey provides a comprehensive overview of local feature matching methods, which are divided into two categories according to whether they use a detector. The Detector-based category comprises Detect-then-Describe, Joint Detection and Description, Describe-then-Detect, and Graph Based techniques. The Detector-free category comprises CNN Based, Transformer Based, and Patch Based methods. Beyond methodological analysis, the study evaluates prevalent datasets and metrics to enable a quantitative comparison of state-of-the-art techniques. The paper also explores practical applications of local feature matching in domains such as Structure from Motion, Remote Sensing Image Registration, and Medical Image Registration, underscoring its versatility and significance across fields. Finally, we outline the current challenges in this domain and suggest future research directions, so that the survey can serve as a reference for researchers working on local feature matching and related areas. A comprehensive list of the studies in this survey is available at https://github.com/vignywang/Awesome-Local-Feature-Matching.

Paper Structure

This paper contains 42 sections, 1 equation, 7 figures, and 9 tables.

Figures (7)

  • Figure 1: Matching results for outdoor images. For images with significant variations in viewpoint and lighting conditions, the matching task becomes considerably more challenging.
  • Figure 2: Representative local feature matching methods. Blue and gray represent Detector-based Models, where gray represents the Graph Based method. The yellow and green blocks represent the CNN Based and Transformer Based methods in Detector-free Models, respectively. In 2018, SuperPoint [detone2018superpoint] pioneered the computation of keypoints and descriptors within a single network. Subsequently, numerous works such as D2Net [dusmanu2019d2] and R2D2 [revaud2019r2d2] attempted to integrate keypoint detection and description for matching purposes. Concurrently, NCNet [rocco2018neighbourhood] introduced four-dimensional cost volumes into local feature matching, initiating a trend of using correlation-based or cost-volume-based convolutional neural networks for Detector-free matching. Building on this trend, methods such as Sparse-NCNet [rocco2020efficient], DRC-Net [li2020dual], GLU-Net [truong2020glu], and PDC-Net [truong2021learning] emerged. In 2020, SuperGlue [sarlin2020superglue] framed the task as a graph matching problem between two sets of features. Following this, SGMNet [chen2021learning] and ClusterGNN [shi2022clustergnn] focused on reducing the computational complexity of the graph matching process. In 2021, approaches such as LoFTR [sun2021loftr] and ASpanFormer [chen2022aspanformer] successfully incorporated Transformer or attention mechanisms into Detector-free matching. They did so by interleaving self-attention and cross-attention modules, which significantly expands the receptive field and further advances deep learning-based matching.
  • Figure 3: Overview of the Local Feature Matching Models and taxonomy of the most relevant approaches.
  • Figure 4: Comparison of prominent Detector-based pipelines for trainable local feature matching. The categorization is based on the relationship between the detection and description steps: (a) Detect-then-Describe framework, (b) Joint Detection and Description framework, and (c) Describe-then-Detect framework.
  • Figure 5: General GNN Matching Model Architecture. First, keypoint positions $p_i$ and their visual descriptors $d_i$ are mapped into individual vectors. Self-attention and cross-attention layers are then applied alternately, $L$ times, within a graph neural network to create enhanced matching descriptors. Finally, the Sinkhorn Algorithm is used to determine the optimal partial assignment.
  • ...and 2 more figures
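
The final step in the Figure 5 caption, using the Sinkhorn Algorithm to obtain a partial assignment, can be illustrated with a small sketch. This is not the survey's or any specific method's implementation. It assumes a precomputed score matrix between the two keypoint sets and adds a "dustbin" row and column (in the style of SuperGlue) so that unmatched keypoints have somewhere to go. The function names, the `dustbin_score` parameter, the iteration count, and the match threshold are all hypothetical choices made for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def sinkhorn_partial_assignment(scores, dustbin_score=0.0, num_iters=50):
    """Log-domain Sinkhorn on a score matrix augmented with dustbins.

    scores: m x n list of lists (higher means more similar).
    Returns an (m+1) x (n+1) soft assignment matrix; the last row and
    column are the dustbins that absorb unmatched keypoints (assumption).
    """
    m, n = len(scores), len(scores[0])
    # Append one dustbin column to every row, then one dustbin row.
    aug = [list(row) + [dustbin_score] for row in scores]
    aug.append([dustbin_score] * (n + 1))
    # Target marginals (log): each keypoint carries mass 1,
    # the row dustbin can absorb n and the column dustbin m.
    log_mu = [0.0] * m + [math.log(n)]
    log_nu = [0.0] * n + [math.log(m)]
    u = [0.0] * (m + 1)
    v = [0.0] * (n + 1)
    for _ in range(num_iters):
        # Alternate row and column normalisation in log space.
        u = [log_mu[i] - logsumexp([aug[i][j] + v[j] for j in range(n + 1)])
             for i in range(m + 1)]
        v = [log_nu[j] - logsumexp([aug[i][j] + u[i] for i in range(m + 1)])
             for j in range(n + 1)]
    return [[math.exp(aug[i][j] + u[i] + v[j]) for j in range(n + 1)]
            for i in range(m + 1)]


def extract_matches(assignment, threshold=0.2):
    """Keep mutual best matches (dustbins excluded) above a threshold."""
    m, n = len(assignment) - 1, len(assignment[0]) - 1
    matches = []
    for i in range(m):
        j = max(range(n), key=lambda c: assignment[i][c])
        i_back = max(range(m), key=lambda r: assignment[r][j])
        if i_back == i and assignment[i][j] > threshold:
            matches.append((i, j))
    return matches


# Toy example: 2 keypoints in image A, 3 in image B.
toy_scores = [[5.0, 0.0, 0.0],
              [0.0, 5.0, 0.0]]
toy_assignment = sinkhorn_partial_assignment(toy_scores)
toy_matches = extract_matches(toy_assignment)
print(toy_matches)
```

In this toy case the third keypoint in image B has no strong counterpart, so most of its mass should flow to the dustbin row rather than to a real keypoint. In a full pipeline the scores would come from the attention-enhanced descriptors rather than being written by hand.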