BCLNet: Bilateral Consensus Learning for Two-View Correspondence Pruning

Xiangyang Miao, Guobao Xiao, Shiping Wang, Jun Yu

TL;DR

Two-view correspondence pruning is improved by a bilateral consensus learning framework that models local and global context concurrently. BCLNet introduces the Bilateral Consensus Mining Attention (BCMA) block for global consensus, pairs it with the existing Order-Aware (OA) block for local consensus, and adds the Bilateral Consensus Recalibrate (BCR) block for robustness, yielding more reliable inliers and more accurate pose estimates. The method achieves state-of-the-art results on correspondence classification and camera pose estimation on datasets such as YFCC100M and SUN3D, including a 3.98% mAP5° gain over the second-best method on unknown outdoor data, and it trains faster. It is also robust across feature extractors such as SIFT and SuperPoint, indicating strong generalization.

Abstract

Correspondence pruning aims to establish reliable correspondences between two related images and recover relative camera motion. Existing approaches often employ a progressive strategy to handle the local and global contexts, with a prominent emphasis on transitioning from local to global, which neglects the interactions between different contexts. To tackle this issue, we propose a parallel context learning strategy that acquires bilateral consensus for the two-view correspondence pruning task. In our approach, we design a distinctive self-attention block to capture global context and process it in parallel with the established local context learning module, which enables us to capture local and global consensuses simultaneously. By combining these local and global consensuses, we derive the required bilateral consensus. We also design a recalibration block that reduces the influence of erroneous consensus information and enhances the robustness of the model. The culmination of our efforts is the Bilateral Consensus Learning Network (BCLNet), which efficiently estimates camera pose and identifies inliers (true correspondences). Extensive experimental results demonstrate that our network not only surpasses state-of-the-art methods on benchmark datasets but also generalizes robustly across various feature extraction techniques. Notably, BCLNet obtains a 3.98\% mAP5$^{\circ}$ gain over the second-best method on the unknown outdoor dataset and noticeably accelerates model training. The source code will be available at: https://github.com/guobaoxiao/BCLNet.
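The parallel strategy described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' BCMA, OA, or BCR block: the global branch here is plain scaled dot-product self-attention, the local branch is a k-nearest-neighbor mean, and the recalibration step is a simple sigmoid channel gate. All function names, the fusion by addition, and the feature sizes are assumptions made only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def global_context(feats):
    # Stand-in for a global branch: every correspondence attends to all others.
    d = feats.shape[1]
    attn = softmax(feats @ feats.T / np.sqrt(d), axis=-1)  # (N, N), rows sum to 1
    return attn @ feats                                     # (N, D)

def local_context(feats, k=4):
    # Stand-in for a local branch: average the k nearest neighbors in feature space.
    sq = (feats ** 2).sum(axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * feats @ feats.T  # squared distances (N, N)
    idx = np.argsort(dist, axis=1)[:, :k]                     # (N, k) neighbor indices
    return feats[idx].mean(axis=1)                            # (N, D)

def bilateral_consensus(feats, k=4):
    # Both branches read the SAME input (parallel), rather than one feeding the other.
    g = global_context(feats)
    l = local_context(feats, k)
    fused = g + l  # assumed fusion: element-wise sum
    # Assumed recalibration: a per-channel sigmoid gate computed from the fused features.
    gate = 1.0 / (1.0 + np.exp(-fused.mean(axis=0, keepdims=True)))  # (1, D)
    return fused * gate

rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8))   # 16 hypothetical correspondences, 8 channels
out = bilateral_consensus(feats)
print(out.shape)
```

The point of the sketch is the data flow: a progressive design would compute `global_context(local_context(feats))`, whereas the parallel design computes both from the same features and lets them interact only at the fusion step.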

Paper Structure (23 sections, 11 equations, 5 figures, 4 tables)


Figures (5)

  • Figure 1: Bilateral consensus acquisition process. Both local and global contexts are inevitably affected by outliers, and neglecting the interaction between them tends to exacerbate the propagation of erroneous information. There may be multiple models in the network that satisfy the global constraints (a) and (b). Neighbors found by k-nearest neighbor search also contain many outliers (c). Given a set of putative correspondences (d), we adopt existing blocks and the designed BCMA block (e) to extract local and global consensuses, respectively. Subsequently, we facilitate their interaction to achieve bilateral consensus, which ultimately generates the network's prediction (f). Red lines represent outliers, green lines represent inliers, and blue lines represent selected correspondences.
  • Figure 2: Architecture of BCLNet for correspondence pruning. We take putative correspondences $C \in R^{N \times 4}$ as inputs and output the inlier probabilities $\omega \in R^{N\times 1}$. The entire process involves two pruning modules that systematically refine the correspondences into more reliable subsets. Each pruning module consists of our proposed Bilateral Consensus Mining Attention block and Bilateral Consensus Recalibrate block, together with the existing Order-Aware block.
  • Figure 3: Illustration of the proposed (a) Bilateral Consensus Mining Attention (BCMA) block and (b) Bilateral Consensus Recalibrate (BCR) block.
  • Figure 4: Visualization results of two-view correspondence pruning on unknown outdoor scenes and unknown indoor scenes. From left to right are the results of RANSAC, CLNet and BCLNet, respectively. Inliers (green lines) and outliers (red lines) retained by each algorithm are shown.
  • Figure 5: Impact of cluster number on BCLNet performance.
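The input and output shapes in the Figure 2 caption, and the idea of two successive pruning modules, can be sketched as follows. This is a toy illustration under stated assumptions, not BCLNet itself: the learned network is replaced by a hand-written score (`toy_score`, a hypothetical heuristic based on deviation from the median displacement), the keep ratio of 0.5 is arbitrary, and writing zero probability for pruned correspondences is an assumption made only so the output matches the $N \times 1$ shape in the caption.

```python
import numpy as np

def toy_score(corr):
    # Hypothetical stand-in for a learned scorer: correspondences whose
    # displacement (x2 - x1, y2 - y1) is close to the median displacement score higher.
    disp = corr[:, 2:] - corr[:, :2]
    dev = np.linalg.norm(disp - np.median(disp, axis=0), axis=1)
    return -dev

def prune(corr, idx, keep_ratio=0.5):
    # Keep the best-scoring fraction of the currently surviving correspondences.
    scores = toy_score(corr[idx])
    n_keep = max(1, int(len(idx) * keep_ratio))
    order = np.argsort(-scores)[:n_keep]
    return idx[order]

def two_stage_pruning(corr, keep_ratio=0.5):
    n = corr.shape[0]
    idx = np.arange(n)
    for _ in range(2):                  # two pruning modules, as in the caption
        idx = prune(corr, idx, keep_ratio)
    final = toy_score(corr[idx])
    prob = np.zeros((n, 1))             # omega with shape (N, 1)
    # Assumed convention: pruned correspondences get probability 0.
    prob[idx, 0] = 1.0 / (1.0 + np.exp(-(final - final.mean())))
    return prob, idx

rng = np.random.default_rng(0)
src = rng.uniform(-1, 1, size=(40, 2))
dst = src + np.array([0.2, 0.1])                # 40 matches sharing one shift
dst[30:] = rng.uniform(-1, 1, size=(10, 2))     # last 10 overwritten as random outliers
C = np.hstack([src, dst])                       # putative correspondences, shape (N, 4)

omega, kept = two_stage_pruning(C)
print(C.shape, omega.shape, len(kept))
```

Each row of `C` holds one putative match as (x1, y1, x2, y2), and each pruning stage shrinks the candidate set before the final probabilities are computed on the surviving subset.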