A Detector-oblivious Multi-arm Network for Keypoint Matching
Xuelun Shen, Qian Hu, Xin Li, Cheng Wang
TL;DR
This work tackles robust keypoint matching with a detector-oblivious Multi-Arm Network (MAN) that leverages region-level cues. Two auxiliary tasks, Overlap Estimation (OVE) and Depth Region Estimation (DEE), teach the network region-aware features across the two images; three parallel arms then fuse these features to produce multiple similarity matrices. Because the description path is detector-oblivious, the network works with arbitrary keypoint detectors and can be deployed without retraining. Results on outdoor and indoor datasets show state-of-the-art or competitive performance in pose estimation and visual localization, and ablations illustrate the contribution of the auxiliary tasks and the multi-arm design. The auxiliary tasks improve robustness while adding little computational cost at inference.
Abstract
This paper presents a matching network that establishes point correspondences between images. We propose a Multi-Arm Network (MAN) that learns region overlap and depth, which greatly improves keypoint matching robustness while adding little computational cost at inference. Unlike many existing learning-based pipelines, which require re-training whenever a different keypoint detector is adopted, our network works directly with different keypoint detectors without this time-consuming re-training. Comprehensive experiments on outdoor and indoor datasets demonstrate that the proposed MAN outperforms state-of-the-art methods.
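To make the detector-oblivious idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the description path yields a dense per-pixel descriptor map, so that descriptors for keypoints from any detector can be read off by bilinear sampling and compared through a similarity matrix. The function names `sample_descriptors` and `mutual_nearest_matches`, the dense-map assumption, and the mutual-nearest-neighbour matching rule are all illustrative choices; the multi-arm fusion and the OVE and DEE tasks are not modelled here.

```python
import numpy as np


def sample_descriptors(feature_map, keypoints):
    """Bilinearly sample unit-length descriptors at keypoint locations.

    feature_map: (H, W, C) dense descriptor map (assumed network output).
    keypoints:   (N, 2) array of sub-pixel (x, y) positions from any detector.
    """
    h, w, _ = feature_map.shape
    x = np.clip(keypoints[:, 0], 0, w - 1)
    y = np.clip(keypoints[:, 1], 0, h - 1)
    # Integer corners of the cell surrounding each keypoint.
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    # Fractional offsets, shaped (N, 1) to broadcast over channels.
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    top = (1 - wx) * feature_map[y0, x0] + wx * feature_map[y0, x1]
    bottom = (1 - wx) * feature_map[y1, x0] + wx * feature_map[y1, x1]
    desc = (1 - wy) * top + wy * bottom
    # L2-normalise so that dot products are cosine similarities.
    norm = np.linalg.norm(desc, axis=1, keepdims=True)
    return desc / np.maximum(norm, 1e-12)


def mutual_nearest_matches(desc_a, desc_b):
    """Return (M, 2) index pairs that are mutual nearest neighbours,
    together with the full (Na, Nb) cosine similarity matrix."""
    similarity = desc_a @ desc_b.T
    nn_ab = similarity.argmax(axis=1)  # best match in B for each A
    nn_ba = similarity.argmax(axis=0)  # best match in A for each B
    idx_a = np.arange(desc_a.shape[0])
    mutual = nn_ba[nn_ab] == idx_a
    return np.stack([idx_a[mutual], nn_ab[mutual]], axis=1), similarity
```

Under these assumptions, switching detectors only changes the `keypoints` array passed to `sample_descriptors`; the descriptor map itself does not depend on where the keypoints fall, which is one way a description path can avoid detector-specific re-training.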
