3D Focusing-and-Matching Network for Multi-Instance Point Cloud Registration
Liyuan Zhang, Le Hui, Qi Liu, Bo Li, Yuchao Dai
TL;DR
The paper introduces 3DFMNet, a center-first approach to multi-instance point cloud registration that decomposes the problem into multiple pairwise registrations. It uses a 3D multi-object focusing module to locate object centers and generate proposals, followed by a 3D dual-masking instance matching module to estimate robust pairwise correspondences via instance and overlap masks. The framework employs attention-based feature correlation, ball-query object proposals, and an optimal-transport-based matching mechanism, optimized by dedicated focusing and matching losses. Experiments on Scan2CAD and ROBI demonstrate state-of-the-art performance, with analysis showing the potential upper-bound gains when centers are known, and ablations validating the necessity of both masking components. The work offers practical improvements for scene-CAD alignment in cluttered environments and provides broader insights for tasks like multi-target tracking and map construction.
Abstract
Multi-instance point cloud registration aims to estimate the poses of all instances of a model point cloud in a scene. Existing methods first obtain global correspondences between the model and the scene and then cluster them to recover the pose of each instance. However, because objects in the scene are cluttered and occluded, it is difficult to obtain accurate correspondences between the model point cloud and all instances in the scene. To this end, we propose a simple yet powerful 3D focusing-and-matching network that casts multi-instance point cloud registration as multiple pair-wise point cloud registrations. Specifically, we first present a 3D multi-object focusing module to locate the center of each object and generate object proposals. By using self-attention and cross-attention to associate the model point cloud with structurally similar objects, we can locate potential matching instances by regressing object centers. Then, we propose a 3D dual-masking instance matching module to estimate the pose between the model point cloud and each object proposal. It applies an instance mask and an overlap mask to accurately predict the pair-wise correspondences. Extensive experiments on two public benchmarks, Scan2CAD and ROBI, show that our method achieves new state-of-the-art performance on the multi-instance point cloud registration task. Code is available at https://github.com/zlynpu/3DFMNet.
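The focus-then-match decomposition described above can be illustrated with a small NumPy sketch. This is not the 3DFMNet implementation: the learned modules are replaced by stand-ins, and every name here (ball_query, kabsch, register_instances, make_toy_scene, scene_to_model) is hypothetical. Object centers are assumed to be given (in the paper they are regressed by the focusing module), correspondences are assumed to be given as an index array (in the paper they come from the dual-masking matching module), and each pair-wise pose is solved in closed form with the Kabsch algorithm.

```python
import numpy as np

def ball_query(points, center, radius):
    """Indices of `points` lying within `radius` of `center` (one object proposal)."""
    return np.flatnonzero(np.linalg.norm(points - center, axis=1) <= radius)

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~= src @ R.T + t, rows matched."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, dst_c - R @ src_c

def register_instances(model, scene, centers, scene_to_model, radius):
    """One pair-wise registration per center. `scene_to_model[i]` is the model index
    matched to scene point i, or -1 for clutter (assumed given in this sketch)."""
    poses = []
    for c in centers:
        idx = ball_query(scene, c, radius)       # proposal around the center
        idx = idx[scene_to_model[idx] >= 0]      # drop clutter inside the ball
        poses.append(kabsch(model[scene_to_model[idx]], scene[idx]))
    return poses

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def make_toy_scene(seed=0):
    """Synthetic scene: two posed copies of a random model plus uniform clutter."""
    rng = np.random.default_rng(seed)
    model = rng.uniform(-0.5, 0.5, size=(50, 3))
    gt = [(rot_z(0.7), np.array([5.0, 0.0, 0.0])),
          (rot_z(-1.2), np.array([-5.0, 2.0, 0.0]))]
    instances = [model @ R.T + t for R, t in gt]
    clutter = rng.uniform(-6.0, 6.0, size=(200, 3))
    scene = np.vstack(instances + [clutter])
    scene_to_model = np.concatenate([np.arange(50), np.arange(50), np.full(200, -1)])
    centers = [R @ model.mean(axis=0) + t for R, t in gt]  # oracle centers
    return model, scene, centers, scene_to_model, gt

if __name__ == "__main__":
    model, scene, centers, s2m, gt = make_toy_scene()
    for (R, t), (R_gt, t_gt) in zip(register_instances(model, scene, centers, s2m, 1.5), gt):
        print("rotation error:", np.abs(R - R_gt).max(),
              "translation error:", np.abs(t - t_gt).max())
```

Because the sketch uses oracle centers and noise-free correspondences, it only shows why localizing each instance first reduces the task to independent pair-wise registrations; the difficulty the paper addresses lies in predicting those centers and correspondences in cluttered, occluded scenes.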
