Incremental Multiview Point Cloud Registration with Two-stage Candidate Retrieval

Shiqi Li, Jihua Zhu, Yifan Xie, Mingchen Zhu

TL;DR

This work tackles multiview point cloud registration by replacing brittle global pose-graph optimization with an incremental strategy that registers each scan to a growing meta-shape. It introduces a two-stage coarse-to-fine frame retrieval that first shortlists candidates using global features and then reranks them by geometric matching, followed by transformation averaging to reduce accumulated drift and a Reservoir sampling-based meta-shape update to handle density variance. The approach reports state-of-the-art registration recall on 3DMatch/3DLoMatch and strong results on ScanNet across different graph sparsities, supporting both its robustness and its generalization. By combining global-feature and geometric cues with density-aware updates, the method offers a scalable, accurate alternative for real-world multiview registration with varying overlap and density.

Abstract

Multiview point cloud registration serves as a cornerstone of various computer vision tasks. Previous approaches typically adhere to a global paradigm, where a pose graph is initially constructed followed by motion synchronization to determine the absolute pose. However, this separated approach may not fully leverage the characteristics of multiview registration and might struggle with low-overlap scenarios. In this paper, we propose an incremental multiview point cloud registration method that progressively registers all scans to a growing meta-shape. To determine the incremental ordering, we employ a two-stage coarse-to-fine strategy for point cloud candidate retrieval. The first stage involves the coarse selection of scans based on neighbor fusion-enhanced global aggregation features, while the second stage further reranks candidates through geometric-based matching. Additionally, we apply a transformation averaging technique to mitigate accumulated errors during the registration process. Finally, we utilize a Reservoir sampling-based technique to address density variance issues while reducing computational load. Comprehensive experimental results across various benchmarks validate the effectiveness and generalization of our approach.
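
The abstract does not spell out how Reservoir sampling is applied to the meta-shape, so the following is only a minimal sketch of the classic reservoir sampling scheme (Algorithm R) on a stream of points, not the authors' update rule. The function name `reservoir_update`, the `capacity` parameter, and the use of plain tuples as points are assumptions made for illustration. It shows why the technique bounds the computational load: the stored set never exceeds `capacity`, yet every point seen so far has the same chance of being retained.

```python
import random


def reservoir_update(reservoir, seen, new_points, capacity, rng):
    """Fold new_points into a fixed-size uniform sample (Algorithm R).

    reservoir  -- list of points kept so far (modified in place)
    seen       -- number of points offered to the reservoir before this call
    new_points -- iterable of incoming points, e.g. a newly registered scan
    capacity   -- maximum number of points to keep (hypothetical parameter)
    rng        -- random.Random instance, passed in for reproducibility
    Returns the updated (reservoir, seen) pair.
    """
    for point in new_points:
        seen += 1
        if len(reservoir) < capacity:
            # Still filling up: keep every point.
            reservoir.append(point)
        else:
            # Keep the new point with probability capacity / seen,
            # overwriting a uniformly chosen slot.
            slot = rng.randrange(seen)
            if slot < capacity:
                reservoir[slot] = point
    return reservoir, seen
```

Calling `reservoir_update` once per registered scan keeps the memory footprint constant regardless of how many scans have been merged. A density-aware variant could keep one such reservoir per spatial cell, but that is a design choice beyond what the abstract states.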

Paper Structure (20 sections, 12 equations, 5 figures, 4 tables, 1 algorithm)

This paper contains 20 sections, 12 equations, 5 figures, 4 tables, 1 algorithm.

Figures (5)

  • Figure 1: An example comprising frame1, frame2, and frame3 within an indoor aisle. Although frame1 and frame3 each overlap with frame2, the shared regions consist mostly of flat floor, which offers few cues for registration. In contrast, frame1 and frame3 share enough structurally distinctive regions to be aligned successfully. Motion synchronization fails on the pose graph shown in the top right corner because no correct edge connects to frame2. Our incremental paradigm instead merges frame1 and frame3 first, which enlarges the meaningful overlap with frame2 and ultimately recovers the complete scene, as illustrated in the bottom right corner.
  • Figure 2: Overview of our pipeline: Point cloud frames are progressively incorporated into the same coordinate system, referred to as the meta-shape in this study. For each frame, the process entails two-stage frame retrieval, transformation refinement, and meta-information update.
  • Figure 3: Left: Illustration depicting the construction of the similarity matrix. Right: Workflow of the proposed two-stage frame retrieval process.
  • Figure 4: An illustration of the meta information update process.
  • Figure 5: Qualitative comparison results.
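
The two-stage retrieval summarized in the captions of Figures 2 and 3 can be sketched as a coarse shortlist followed by a rerank. This is an illustrative sketch under stated assumptions, not the paper's implementation: cosine similarity stands in for the similarity built from global features, and `geometric_score` is a hypothetical callback (for example, an inlier count from pairwise matching) supplied by the caller. The names `two_stage_retrieval` and `top_k` are likewise assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def two_stage_retrieval(query_feat, cand_feats, geometric_score, top_k=3):
    """Pick the next frame to register.

    Stage 1 (coarse): rank candidates by global-feature similarity to the
    query and keep the top_k.
    Stage 2 (fine): rerank the shortlist with geometric_score(index), a
    hypothetical, more expensive matching score, and return the best index.
    Returns (best_index, shortlist).
    """
    sims = [cosine(query_feat, feat) for feat in cand_feats]
    order = sorted(range(len(cand_feats)), key=lambda i: sims[i], reverse=True)
    shortlist = order[:top_k]
    best = max(shortlist, key=geometric_score)
    return best, shortlist
```

The point of the split is cost: the cheap global comparison touches every candidate, while the expensive geometric check runs only on the `top_k` survivors. The trade-off is that a candidate pruned in the first stage can no longer be selected, however well it would have matched geometrically.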