HomoMatcher: Dense Feature Matching Results with Semi-Dense Efficiency by Homography Estimation

Xiaolong Wang, Lei Yu, Yingying Zhang, Jiangwei Lao, Lixiang Ru, Liheng Zhong, Jingdong Chen, Yu Zhang, Ming Yang

TL;DR

This paper concentrates on enhancing the fine-matching module in the semi-dense matching framework by employing a lightweight and efficient homography estimation network to generate the perspective mapping between patches obtained from coarse matching.

Abstract

Feature matching between image pairs is a fundamental problem in computer vision that drives many applications, such as SLAM. Recently, semi-dense matching approaches have achieved substantial performance gains and established a widely accepted coarse-to-fine paradigm. However, the majority of existing methods focus on improving the coarse feature representation rather than the fine-matching module. Prior fine-matching techniques, which rely on point-to-patch matching probability expectation or direct regression, often lack precision and do not guarantee the continuity of feature points across sequential images. To address this limitation, this paper concentrates on enhancing the fine-matching module in the semi-dense matching framework. We employ a lightweight and efficient homography estimation network to generate the perspective mapping between patches obtained from coarse matching. This patch-to-patch approach aligns the two patches as a whole, and the additional constraints this imposes yield higher sub-pixel accuracy. By leveraging the homography estimated between patches, we can also obtain a dense matching result at low computational cost. Extensive experiments demonstrate that our method achieves higher accuracy than previous semi-dense matchers. Meanwhile, our dense matching results exhibit end-point-error accuracy similar to that of previous dense matchers while maintaining semi-dense efficiency.
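
To make the patch-to-patch idea concrete, the following is a minimal sketch of how a single 3x3 homography between two matched patches could yield both a refined sub-pixel match and a set of dense correspondences. It is not the authors' implementation: the function name `warp_patch_grid`, the patch-centred coordinate convention, and the example matrix are assumptions made for illustration, and the homography is supplied by hand rather than predicted by a network.

```python
import numpy as np

def warp_patch_grid(H, w):
    """Map every pixel of a w x w patch through a 3x3 homography H.

    Coordinates are patch-centred (an assumption of this sketch), so the
    point (0, 0) corresponds to the coarse match location.
    Returns (src, dst), each of shape (w*w, 2) in (x, y) order.
    """
    ys, xs = np.meshgrid(np.arange(w), np.arange(w), indexing="ij")
    c = (w - 1) / 2.0
    # Homogeneous coordinates, one column per pixel: shape (3, w*w).
    pts = np.stack([xs.ravel() - c, ys.ravel() - c, np.ones(w * w)], axis=0)
    warped = H @ pts
    # Perspective division back to inhomogeneous coordinates.
    warped = warped[:2] / warped[2:3]
    return pts[:2].T, warped.T

# Hypothetical homography: a pure sub-pixel translation of (0.25, -0.5).
H = np.array([[1.0, 0.0, 0.25],
              [0.0, 1.0, -0.5],
              [0.0, 0.0, 1.0]])
src, dst = warp_patch_grid(H, 5)
center = 12  # index of the patch centre for w = 5 (row 2, column 2)
fine_match_offset = dst[center]   # refined position of the coarse match
dense_matches = np.hstack([src, dst])  # one correspondence per patch pixel
```

In this sketch the warped centre plays the role of the refined match, while the remaining rows of `dense_matches` come at no extra estimation cost, since all of them are derived from the same homography.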

Paper Structure

This paper contains 24 sections, 4 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: A visualization of dense matching results from our proposed HomoMatcher and the dense matching method RoMa (Edstedt et al., 2024). HomoMatcher operates within a semi-dense framework, maintaining efficiency and enabling the flexible expansion of semi-dense results into dense mappings. The middle row shows RoMa's results, displaying warps with certainty values above a threshold of 0.02. The bottom row presents our results, demonstrating our method's capability for dense matching refinement.
  • Figure 2: An illustration of the proposed method. We use a CNN backbone to extract coarse-level and fine-level features from the given images $I_A$ and $I_B$. We first enhance the coarse features and then obtain the coarse matching result $\mathcal{M}_c$ using the mutual nearest neighbor (MNN) criterion. For each match $(i,j) \in \mathcal{M}_c$, we extract a pair of patches of size $w \times w$ from the fine-level features, centered at the upsampled match positions, resulting in $F_A^P$ and $F_B^P$. We iteratively estimate the homography $\mathbf{H}$ between the matched patches to refine the matches to sub-pixel level, yielding the refined matches $\mathcal{M}_f$. The same homography also gives the corresponding dense matches $\mathcal{M}_d$.
  • Figure 3: Visualization of sampling a 4D correlation volume using homography estimation $H^{k-1}$. The top row illustrates the process of sampling a 4D correlation volume, which has dimensions $w \times w \times w \times w$, into a $w \times w \times (2r+1) \times (2r +1)$ 4D correlation slice. The bottom row demonstrates how each pixel location is sampled from the correlation patch using a $(2r+1) \times (2r +1)$ window based on pixel mapping results.
  • Figure 4: Visualization of the impact of different expansion radii ($r_e$) on match densification. From left to right, the images show the warp target and the warp results of dense matching obtained with expansion radii of $r_e = 2, 3, 4$. The zoomed-in details of the spire further illustrate the reliability of our model's densification.
  • Figure 5: Visualization of end-point error (EPE). The color gradient from blue to red represents increasing EPE.
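
The MNN criterion named in the Figure 2 caption can be illustrated with a short sketch. This is a generic mutual-nearest-neighbor check on a similarity matrix, not the paper's code: the function name `mutual_nearest_neighbors` and the toy matrix `sim` are assumptions, and any confidence threshold the actual method may apply is omitted.

```python
import numpy as np

def mutual_nearest_neighbors(sim):
    """Return (i, j) index pairs that are each other's best match.

    sim[i, j] is the similarity between coarse feature i of image A and
    coarse feature j of image B (hypothetical input for this sketch).
    """
    best_j = sim.argmax(axis=1)  # best candidate in B for each row i
    best_i = sim.argmax(axis=0)  # best candidate in A for each column j
    i = np.arange(sim.shape[0])
    keep = best_i[best_j] == i   # keep i only if its best j also picks i
    return np.stack([i[keep], best_j[keep]], axis=1)

# Toy similarity matrix: rows 0 and 1 both prefer column 0,
# but column 0 prefers row 0, so row 1 is left unmatched.
sim = np.array([[0.9, 0.1, 0.0],
                [0.8, 0.2, 0.1],
                [0.0, 0.1, 0.7]])
coarse_matches = mutual_nearest_neighbors(sim)
```

Each surviving pair would then serve as the centre of a patch pair passed to the fine-matching stage described in the caption.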