MGNet: Learning Correspondences via Multiple Graphs
Luanyuan Dai, Xiaoyu Du, Hanwang Zhang, Jinhui Tang
TL;DR
MGNet tackles the problem of learning reliable correspondences from sparse, noisy initial matches by employing a multi-graph framework that jointly builds implicit and explicit local graphs and fuses them into a global graph through Graph Soft Degree Attention ($A^S$, final adjacency $\tilde{A^S}$, and diagonal $D^S$). It introduces a two-stage, verification-based pipeline with a hybrid loss $L = L_c + \beta L_e(E,\hat{E})$ to refine inliers and estimate geometry, guided by a weighted eight-point algorithm for verification. Across camera pose estimation, homography estimation, and visual localization, MGNet achieves state-of-the-art results on diverse datasets while maintaining relatively few parameters, demonstrating strong generalization to unseen data and detectors. The approach leverages long-range relationships among sparse correspondences and shows practical impact for robust visual localization and geometric tasks in challenging conditions.
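The weighted eight-point step mentioned above can be illustrated with a minimal NumPy sketch. It follows the standard weighted least-squares formulation (smallest eigenvector of $X^\top \mathrm{diag}(w) X$) and is not taken from the MGNet code. The function name `weighted_eight_point`, the use of normalized image coordinates, and the row-major flattening of $\hat{E}$ are assumptions made for illustration.

```python
import numpy as np


def weighted_eight_point(x1, x2, w):
    """Estimate a 3x3 matrix E from weighted correspondences (illustrative sketch).

    x1, x2: (N, 2) arrays of matched points in normalized image coordinates
            (an assumption; the summary above does not state the convention).
    w:      (N,) non-negative per-correspondence weights, e.g. inlier scores.

    Returns the unit-norm E minimizing sum_i w_i * (x2_i^T E x1_i)^2,
    defined up to sign.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    w = np.asarray(w, dtype=float)

    u1, v1 = x1[:, 0], x1[:, 1]
    u2, v2 = x2[:, 0], x2[:, 1]
    ones = np.ones_like(u1)

    # Each row encodes the epipolar constraint x2^T E x1 = 0,
    # with E flattened in row-major order.
    X = np.stack(
        [u2 * u1, u2 * v1, u2, v2 * u1, v2 * v1, v2, u1, v1, ones], axis=1
    )

    # Weighted normal matrix X^T diag(w) X (9 x 9, symmetric).
    M = X.T @ (w[:, None] * X)

    # eigh returns eigenvalues in ascending order, so column 0 is the
    # eigenvector of the smallest eigenvalue.
    _, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, 0].reshape(3, 3)
```

Solving through a symmetric eigen-decomposition keeps the estimate a smooth function of the weights, which is why this form is a common choice when the weights are produced by a network. The sketch does not enforce the rank-2 or equal-singular-value constraints of an essential matrix.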
Abstract
Learning correspondences aims to find correct correspondences (inliers) in an initial correspondence set that has an uneven distribution and a low inlier rate; such a set can be regarded as graph data. Recent advances usually use graph neural networks (GNNs) to build a single type of graph or simply stack local graphs into a global one. However, they ignore the complementary relationship between different types of graphs, which can effectively capture potential relationships among sparse correspondences. To address this problem, we propose MGNet to effectively combine multiple complementary graphs. We construct local graphs from both implicit and explicit aspects and combine them, and the combined result is used to build a global graph. Moreover, we propose Graph Soft Degree Attention (GSDA) to make full use of all sparse correspondence information at once in the global graph, which can capture and amplify discriminative features. Extensive experiments demonstrate that MGNet outperforms state-of-the-art methods in different visual tasks. The code is available at https://github.com/DAILUANYUAN/MGNet-2024AAAI.
