MGNet: Learning Correspondences via Multiple Graphs

Luanyuan Dai, Xiaoyu Du, Hanwang Zhang, Jinhui Tang

TL;DR

MGNet learns reliable correspondences from sparse, noisy initial matches with a multi-graph framework: it jointly builds implicit and explicit local graphs and fuses them into a global graph through Graph Soft Degree Attention (built from a soft adjacency matrix $A^S$, its final form $\tilde{A^S}$, and a diagonal soft degree matrix $D^S$). It introduces a two-stage, verification-based pipeline with a hybrid loss $L = L_c + \beta L_e(E,\hat{E})$ to refine inliers and estimate geometry, guided by a weighted eight-point algorithm for verification. Across camera pose estimation, homography estimation, and visual localization, MGNet achieves state-of-the-art results on diverse datasets while using relatively few parameters, and it generalizes well to unseen data and detectors. The approach exploits long-range relationships among sparse correspondences, which makes it practical for robust visual localization and geometric tasks in challenging conditions.
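The weighted eight-point step mentioned above can be illustrated with a minimal NumPy sketch. This is the textbook weighted least-squares formulation, not necessarily the exact variant used in MGNet; the function name `weighted_eight_point`, the use of normalized image coordinates, and the final projection onto two equal singular values are assumptions made for illustration.

```python
import numpy as np

def weighted_eight_point(x1, x2, w):
    """Estimate an essential matrix E from N >= 8 weighted correspondences.

    x1, x2: (N, 2) normalized image coordinates in the two views (assumed).
    w:      (N,) non-negative weights, e.g. predicted inlier probabilities.
    Returns a 3x3 matrix with unit Frobenius norm (sign is arbitrary).
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    w = np.asarray(w, dtype=float)
    u1, v1 = x1[:, 0], x1[:, 1]
    u2, v2 = x2[:, 0], x2[:, 1]
    ones = np.ones_like(u1)
    # Each row encodes the epipolar constraint x2^T E x1 = 0,
    # with E flattened in row-major order.
    X = np.stack([u2 * u1, u2 * v1, u2,
                  v2 * u1, v2 * v1, v2,
                  u1, v1, ones], axis=1)
    # Weighted least squares: minimize sum_i w_i (X_i e)^2 with ||e|| = 1.
    # The minimizer is the eigenvector of X^T diag(w) X with the smallest
    # eigenvalue (eigh returns eigenvalues in ascending order).
    M = X.T @ (w[:, None] * X)
    _, eigvecs = np.linalg.eigh(M)
    E = eigvecs[:, 0].reshape(3, 3)
    # Project onto the essential-matrix structure: two equal singular
    # values and a zero third one.
    U, S, Vt = np.linalg.svd(E)
    s = 0.5 * (S[0] + S[1])
    E = U @ np.diag([s, s, 0.0]) @ Vt
    return E / np.linalg.norm(E)
```

Because every step is differentiable linear algebra, weights close to zero effectively remove a correspondence from the estimate, which is why such a solver pairs naturally with predicted inlier probabilities.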

Abstract

Learning correspondences aims to find correct correspondences (inliers) from the initial correspondence set with an uneven correspondence distribution and a low inlier rate, which can be regarded as graph data. Recent advances usually use graph neural networks (GNNs) to build a single type of graph or simply stack local graphs into the global one to complete the task. But they ignore the complementary relationship between different types of graphs, which can effectively capture potential relationships among sparse correspondences. To address this problem, we propose MGNet to effectively combine multiple complementary graphs. To obtain information integrating implicit and explicit local graphs, we construct local graphs from implicit and explicit aspects and combine them effectively, which is used to build a global graph. Moreover, we propose Graph Soft Degree Attention (GSDA) to make full use of all sparse correspondence information at once in the global graph, which can capture and amplify discriminative features. Extensive experiments demonstrate that MGNet outperforms state-of-the-art methods in different visual tasks. The code is available at https://github.com/DAILUANYUAN/MGNet-2024AAAI.

Paper Structure (39 sections, 14 equations, 5 figures, 11 tables)

Figures (5)

  • Figure 1: Graph Soft Degree Attention, in which $A^S$, $\tilde{A^S}$ and $D^S$ denote the Soft Adjacent Matrix, the final Soft Adjacent Matrix and the Soft Degree Matrix, respectively. Combined with the probability-like values (the color scale runs from white at 0 through red at 1 to blue at 2), this shows that the Soft Degree Matrix $D^S$ can capture and amplify discriminative features.
  • Figure 2: Network architecture of MGNet. The input is a putative correspondence set $C$, and the output is the final probability set $P$. Here $i \in \{1, 2\}$.
  • Figure 3: Typical visualization results on the YFCC100M and SUN3D datasets with SIFT. From top to bottom: input image pairs, results of CLNet, and results of our MGNet. Green lines denote inliers and red lines denote outliers.
  • Figure 4: Relationship between mAP$5^{\circ}$ (%) and the number of network parameters for different cluster numbers $m$.
  • Figure 5: Parametric analysis of $k$ in the explicit local graph.
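
The parameter $k$ in Figure 5 suggests a neighborhood size for the explicit local graph. As a rough sketch, assuming (the captions do not confirm this) that the explicit graph links each correspondence to its $k$ nearest neighbors in some feature space, the neighbor indices could be computed as follows. The function name `knn_graph` and the choice of Euclidean distance are hypothetical.

```python
import numpy as np

def knn_graph(features, k):
    """Return an (N, k) array of nearest-neighbor indices, self excluded.

    features: (N, D) array with one feature vector per correspondence.
    k:        number of neighbors per node, with k < N.
    """
    features = np.asarray(features, dtype=float)
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq = np.sum(features ** 2, axis=1)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (features @ features.T)
    # Exclude self-loops so a node is never its own neighbor.
    np.fill_diagonal(dist2, np.inf)
    # Sort each row by distance and keep the k closest nodes.
    return np.argsort(dist2, axis=1)[:, :k]
```

Larger $k$ gives each node more context at a higher cost per node, which is the kind of trade-off a parametric analysis of $k$ would typically examine.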