PCR-CG: Point Cloud Registration via Deep Explicit Color and Geometry

Yu Zhang, Junle Yu, Xiaolin Huang, Wenhui Zhou, Ji Hou

TL;DR

PCR-CG tackles RGB-D point cloud registration by introducing a 2D-3D projection module that explicitly lifts deep color features into 3D geometry. Using Predator as its baseline, it relies on pre-trained 2D networks to provide region-based color cues, which are fused with geometry through explicit region-wise projection guided by camera intrinsics. The approach improves registration recall (RR) on 3DLoMatch by up to 6.5% over the baseline and also improves SOTA methods, with RR gains of 2.4% over GeoTransformer and 3.5% over CoFiNet, demonstrating the value of cross-modality color-geometry learning for low-level registration tasks. Ablations over different pre-trained 2D networks show that registration performance correlates positively with the pre-trained weights, indicating that 2D pre-training transfers to 3D registration.

Abstract

In this paper, we introduce PCR-CG: a novel 3D point cloud registration module that explicitly embeds color signals into the geometry representation. Unlike previous methods that use only the geometry representation, our module is specifically designed to effectively correlate color with geometry for the point cloud registration task. Our key contribution is a 2D-3D cross-modality learning algorithm that embeds the deep features learned from color signals into the geometry representation. With our designed 2D-3D projection module, the pixel features in a square region centered at correspondences perceived from images are effectively correlated with point clouds. In this way, the overlapped regions can be inferred not only from the point cloud but also from the texture appearance. Adding color is non-trivial. We compare against a variety of baselines designed for adding color to 3D, such as exhaustively adding per-pixel features or RGB values in an implicit manner. We leverage Predator [25] as the baseline method and incorporate our proposed module into it. To validate the effectiveness of 2D features, we ablate different 2D pre-trained networks and show a positive correlation between the pre-trained weights and the task performance. Our experimental results indicate a significant improvement of 6.5% registration recall over the baseline method on the 3DLoMatch benchmark. We additionally evaluate our approach on SOTA methods and observe consistent improvements, such as an improvement of 2.4% registration recall over GeoTransformer as well as 3.5% over CoFiNet. Our study reveals the significant advantages of correlating explicit deep color features with the point cloud in the registration task.
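
The gains above are reported as registration recall. As a rough illustration of how such a metric is typically computed, the sketch below counts a pair of point clouds as registered when the RMSE over its ground-truth correspondences, after applying the estimated transform, falls below a threshold. The function name `registration_recall`, the input layout, and the 0.2 m default threshold are assumptions made for this example (the text above does not state them); this is not the authors' evaluation code.

```python
import numpy as np


def registration_recall(est_transforms, gt_correspondences, rmse_threshold=0.2):
    """Fraction of pairs whose correspondence RMSE is below the threshold.

    est_transforms: list of 4x4 estimated source-to-target transforms.
    gt_correspondences: list of (src_pts, tgt_pts) arrays of shape (N, 3),
        matched row by row under the ground-truth alignment.
    rmse_threshold: success threshold in meters (0.2 is an assumed default).
    """
    successes = []
    for transform, (src, tgt) in zip(est_transforms, gt_correspondences):
        # Apply the estimated rigid transform to the source points.
        src_aligned = src @ transform[:3, :3].T + transform[:3, 3]
        # RMSE of the residuals over the ground-truth correspondences.
        rmse = np.sqrt(np.mean(np.sum((src_aligned - tgt) ** 2, axis=1)))
        successes.append(rmse < rmse_threshold)
    return float(np.mean(successes))


# Toy data: two pairs sharing the same ground-truth translation.
rng = np.random.default_rng(0)
src = rng.uniform(-1.0, 1.0, size=(50, 3))
gt = np.eye(4)
gt[:3, 3] = [1.0, 0.0, 0.0]
tgt = src + gt[:3, 3]

good = gt.copy()   # exact estimate
bad = gt.copy()
bad[0, 3] += 0.5   # estimate that is 0.5 m off along x

rr = registration_recall([good, bad], [(src, tgt), (src, tgt)])
print(rr)
```

In this toy setup one of the two estimates is within the threshold, so the recall is one half.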

Paper Structure (10 sections, 5 equations, 7 figures, 10 tables)

This paper contains 10 sections, 5 equations, 7 figures, 10 tables.

Figures (7)

  • Figure 1: We seek to align two point clouds in RGB-D data. To better leverage RGB information, we propose PCR-CG, a 2D-3D projection module that explicitly lifts 2D deep color features to the 3D geometry representation. A pair of RGB-D frames is used as input, where each RGB-D frame is composed of a color image and a depth frame. 3D geometry is represented by the point cloud generated from the depth frame. We leverage a pre-trained 2D network to predict correspondences between frames and extract regional features from color images. The 2D regional features are then explicitly lifted to 3D via our proposed 2D-3D projection module. Our code is open sourced at: https://github.com/Gardlin/PCR-CG
  • Figure 2: PCR-CG Pipeline. The pipeline is composed of a 3D network, a 2D network and a 2D-3D projection module. Both 3D geometry and 2D images are taken as input and used to jointly learn features for detecting correspondences. The 2D network takes RGB images as input and extracts per-region features. The 2D-3D projection module lifts 2D pixel features into the 3D point cloud. The concatenated features are fed into the 3D network for finding correspondences. Thanks to the 2D-3D projection module, the 3D supervision can pass gradients back to the 2D network, which enables end-to-end training.
  • Figure 3: Color Coverage Ablation. Non-black points can take 2D features; black points indicate that no features are taken from 2D. We observe that with one image, not every point can be associated with color features; with two views, most points are covered by projected color features. However, adding a third view does not significantly improve the color coverage. Therefore, our approach takes two views as the default setup.
  • Figure 4: 2D-3D Projection Module. We introduce a novel 2D-3D projection module to lift deep color features into 3D. The module uses the transformation matrix and the depth map to project regional features onto the point cloud.
  • Figure 5: Qualitative Comparisons on 3DLoMatch. We demonstrate visual comparisons between PCR-CG and other SOTA methods. With the help of PCR-CG, the registration results are much more accurate.
  • ...and 2 more figures
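
The captions for Figures 3 and 4 describe lifting regional 2D features onto points and note that some points receive no color features. A minimal sketch of that idea, assuming a pinhole camera model, is given below: points are projected into the image with a transform and intrinsics, features are gathered only inside square regions around 2D keypoints, and uncovered points keep zero features. The function names (`square_region_mask`, `lift_features_to_points`), the nearest-pixel lookup, and the zero fill are assumptions for illustration, not the authors' implementation; the sketch is written in NumPy and is therefore not differentiable.

```python
import numpy as np


def square_region_mask(height, width, centers_uv, half_size):
    """Boolean image mask that is True inside squares around each (u, v) center."""
    mask = np.zeros((height, width), dtype=bool)
    for u, v in centers_uv:
        # Clip the square to the image so slices never wrap around.
        u0, u1 = np.clip([int(u) - half_size, int(u) + half_size + 1], 0, width)
        v0, v1 = np.clip([int(v) - half_size, int(v) + half_size + 1], 0, height)
        mask[v0:v1, u0:u1] = True
    return mask


def lift_features_to_points(points, cam_from_points, intrinsics, feat_map,
                            region_mask=None):
    """Gather per-pixel features for 3D points; returns (features, covered)."""
    n = points.shape[0]
    height, width, channels = feat_map.shape
    # Move points into the camera frame with the 4x4 rigid transform.
    pts_cam = points @ cam_from_points[:3, :3].T + cam_from_points[:3, 3]
    z = pts_cam[:, 2]
    in_front = z > 1e-6
    safe_z = np.where(in_front, z, 1.0)  # avoid dividing by zero or negative depth
    # Pinhole projection followed by a nearest-pixel lookup.
    u = np.round(intrinsics[0, 0] * pts_cam[:, 0] / safe_z + intrinsics[0, 2]).astype(int)
    v = np.round(intrinsics[1, 1] * pts_cam[:, 1] / safe_z + intrinsics[1, 2]).astype(int)
    covered = in_front & (u >= 0) & (u < width) & (v >= 0) & (v < height)
    if region_mask is not None:
        # Keep only points that land inside a selected square region.
        covered[covered] = region_mask[v[covered], u[covered]]
    point_feats = np.zeros((n, channels), dtype=feat_map.dtype)
    point_feats[covered] = feat_map[v[covered], u[covered]]
    return point_feats, covered


# Toy setup: 48x64 feature map with 2 channels and an identity camera pose.
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 24.0], [0.0, 0.0, 1.0]])
feat_map = np.arange(48 * 64 * 2, dtype=np.float32).reshape(48, 64, 2)
points = np.array([[0.0, 0.0, 1.0],    # projects to the image center
                   [0.1, 0.0, 1.0],    # inside the image, outside the region
                   [1.0, 0.0, 1.0],    # projects outside the image
                   [0.0, 0.0, -1.0]])  # behind the camera
pose = np.eye(4)

mask = square_region_mask(48, 64, [(32, 24)], half_size=4)
feats, covered = lift_features_to_points(points, pose, K, feat_map, mask)
_, covered_no_mask = lift_features_to_points(points, pose, K, feat_map)
print(covered, covered_no_mask)
```

With more than one view, the per-view `covered` masks could be combined with a logical OR to estimate how many points receive color features, which is the quantity the coverage ablation in Figure 3 compares across one, two, and three views.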