PCR-CG: Point Cloud Registration via Deep Explicit Color and Geometry
Yu Zhang, Junle Yu, Xiaolin Huang, Wenhui Zhou, Ji Hou
TL;DR
PCR-CG tackles RGB-D point cloud registration by introducing a 2D-3D projection module that explicitly lifts deep color features into the 3D geometry representation. Built on a Predator backbone, it uses pre-trained 2D networks to extract region-based color cues, which are fused with geometry through explicit region-wise projection guided by camera intrinsics. The approach improves registration recall ($RR$) on 3DLoMatch by $6.5\%$ over the Predator baseline and also boosts SOTA methods, with $RR$ gains of $2.4\%$ over GeoTransformer and $3.5\%$ over CoFiNet, demonstrating the value of cross-modality color-geometry learning for registration. Ablations over different 2D pre-trained networks show a positive correlation between the pre-trained weights and registration performance.
Abstract
In this paper, we introduce PCR-CG: a novel 3D point cloud registration module that explicitly embeds color signals into the geometry representation. Unlike previous methods that use only the geometry representation, our module is specifically designed to effectively correlate color with geometry for the point cloud registration task. Our key contribution is a 2D-3D cross-modality learning algorithm that embeds the deep features learned from color signals into the geometry representation. With our 2D-3D projection module, the pixel features in a square region centered at correspondences perceived from images are effectively correlated with point clouds. In this way, the overlapping regions can be inferred not only from the point cloud but also from texture appearance. Adding color is non-trivial. We compare against a variety of baselines designed for adding color to 3D, such as exhaustively adding per-pixel features or RGB values in an implicit manner. We use Predator [25] as the baseline method and incorporate our proposed module into it. To validate the effectiveness of 2D features, we ablate different 2D pre-trained networks and show a positive correlation between the pre-trained weights and the task performance. Our experimental results indicate a significant improvement of 6.5% registration recall over the baseline method on the 3DLoMatch benchmark. We additionally evaluate our approach on SOTA methods and observe consistent improvements, such as an improvement of 2.4% registration recall over GeoTransformer and 3.5% over CoFiNet. Our study reveals a significant advantage of correlating explicit deep color features with the point cloud in the registration task.
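To make the region-wise 2D-3D projection concrete, the following is a minimal sketch, not the authors' implementation. It assumes a pinhole camera with intrinsics matrix `K`, points already expressed in the camera frame, and a dense 2D feature map of shape `H x W x C`. The function names, the `half_window` parameter, and the choice of mean pooling over the square region are all illustrative assumptions; the abstract does not specify how region features are aggregated.

```python
import numpy as np


def project_points(points_cam, K):
    """Pinhole projection of N x 3 camera-frame points (z > 0) to N x 2 pixel coords (u, v)."""
    uvw = points_cam @ K.T               # N x 3 homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]      # divide by depth


def gather_region_features(feat_map, points_cam, K, half_window=1):
    """Attach to each 3D point the mean 2D feature of a square pixel region.

    feat_map:    H x W x C array of 2D features (hypothetical output of a 2D network).
    points_cam:  N x 3 array of points in the camera frame.
    half_window: region half-size; the region is (2 * half_window + 1) pixels wide,
                 clipped at the image border. Mean pooling is an assumption.
    Returns (features N x C, valid mask N); points that fall behind the camera
    or outside the image keep a zero feature and valid = False.
    """
    H, W, C = feat_map.shape
    n = points_cam.shape[0]
    out = np.zeros((n, C), dtype=np.float64)
    valid = np.zeros(n, dtype=bool)

    # Only points in front of the camera can be projected.
    front = np.flatnonzero(points_cam[:, 2] > 0)
    if front.size == 0:
        return out, valid

    uv = project_points(points_cam[front], K)
    cols = np.rint(uv[:, 0]).astype(int)
    rows = np.rint(uv[:, 1]).astype(int)
    inside = (cols >= 0) & (cols < W) & (rows >= 0) & (rows < H)

    for i, r, c in zip(front[inside], rows[inside], cols[inside]):
        r0, r1 = max(r - half_window, 0), min(r + half_window + 1, H)
        c0, c1 = max(c - half_window, 0), min(c + half_window + 1, W)
        out[i] = feat_map[r0:r1, c0:c1].reshape(-1, C).mean(axis=0)
        valid[i] = True
    return out, valid
```

In a pipeline of this kind, the returned per-point color features would then be combined with the geometric features of the 3D backbone; how that fusion is done in PCR-CG is not described in this section, so it is left out of the sketch.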
