PCKRF: Point Cloud Completion and Keypoint Refinement With Fusion Data for 6D Pose Estimation

Yiheng Han, Irvin Haozhe Zhan, Long Zeng, Yu-Ping Wang, Ran Yi, Minjing Yu, Matthieu Gaetan Lin, Jenny Sheng, Yong-Jin Liu

TL;DR

The paper tackles the difficulty of achieving high-precision 6D pose estimation under occlusion, symmetry, and incomplete geometry by introducing PCKRF, a two-stage pipeline that first completes partial point clouds via a pose-sensitive network and then refines the pose using Color Supported Iterative KeyPoint (CIKP) refinement. The method fuses RGB-D information through a two-branch feature extractor and trains a keypoint detector to steer completion toward pose-sensitive regions, with a joint loss $L = \alpha L_{kp} + \beta L_c + \gamma L_{cd}$. The core contributions are (i) a novel point cloud completion network with integrated keypoint detection, (ii) a color-aware iterative keypoint refinement that leverages completed geometry, and (iii) a principled integration into existing pose estimation frameworks, improving stability and accuracy on challenging datasets such as YCB-Video and Occlusion LineMOD. Experiments show robust improvements in textureless and symmetric object scenarios and demonstrate that learning-based refiners alone may be less reliable, while PCKRF provides broad compatibility and practical gains for precise 6D pose estimation.

Abstract

Some robust point cloud registration approaches with controllable pose refinement magnitude, such as ICP and its variants, are commonly used to improve 6D pose estimation accuracy. However, the effectiveness of these methods gradually diminishes with the advancement of deep learning techniques and the enhancement of initial pose accuracy, primarily due to their lack of specific design for pose refinement. In this paper, we propose Point Cloud Completion and Keypoint Refinement with Fusion Data (PCKRF), a new pose refinement pipeline for 6D pose estimation. The pipeline consists of two steps. First, it completes the input point clouds via a novel pose-sensitive point completion network. The network uses both local and global features with pose information during point completion. Then, it registers the completed object point cloud with the corresponding target point cloud by our proposed Color supported Iterative KeyPoint (CIKP) method. The CIKP method introduces color information into registration and registers a point cloud around each keypoint to increase stability. The PCKRF pipeline can be integrated with existing popular 6D pose estimation methods, such as the full flow bidirectional fusion network, to further improve their pose estimation accuracy. Experiments demonstrate that our method exhibits superior stability compared to existing approaches when optimizing initial poses with relatively high precision. Notably, the results indicate that our method effectively complements most existing pose estimation techniques, leading to improved performance in most cases. Furthermore, our method achieves promising results even in challenging scenarios involving textureless and symmetrical objects. Our source code is available at https://github.com/zhanhz/KRF.
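
The abstract says CIKP introduces color information into registration but does not spell out how. As a minimal sketch of one plausible reading, and not the authors' formulation, the correspondence search can be run in a joint space of position and weighted color. The function name `color_aware_matches` and the `color_weight` parameter are assumptions made for illustration.

```python
import numpy as np

def color_aware_matches(src_xyz, src_rgb, tgt_xyz, tgt_rgb, color_weight=0.1):
    """For each source point, return the index of the closest target point
    in a joint (xyz, weighted rgb) space.

    color_weight is a hypothetical knob (not from the paper): 0 reduces to
    plain geometric nearest neighbour, larger values let color differences
    override small geometric differences.
    """
    src = np.hstack([src_xyz, color_weight * src_rgb])
    tgt = np.hstack([tgt_xyz, color_weight * tgt_rgb])
    # Brute-force squared distances, shape (n_src, n_tgt).
    d2 = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)
```

With `color_weight=0` this is the matching step of plain ICP; the brute-force distance matrix is only suitable for small point sets such as the neighbourhood of a single keypoint.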

Paper Structure (25 sections, 10 equations, 5 figures, 13 tables, 1 algorithm)

This paper contains 25 sections, 10 equations, 5 figures, 13 tables, 1 algorithm.

Figures (5)

  • Figure 1: Steps of our method. Given an input RGB-D image (a) (the bottom half shows the depth map) and an initial pose, we transform the visible point cloud (blue; the known object point cloud is shown in black) and the keypoints (orange; ground-truth keypoints are shown in green) into the object coordinate system (b). After completing the visible point cloud and sampling points (purple) within a sphere of radius $r$ around each keypoint (c), we iteratively register the purple and black point clouds (d) to obtain the refined keypoints (red) (e). We then use least-squares fitting to obtain the final pose. The model transformed by the final pose is shown in (f). The refined keypoints are closer to the ground truth than the original keypoints.
  • Figure 2: The upper diagram shows the PCKRF pipeline and the lower diagram shows the architecture of our point cloud completion network. In the preprocessing step, we use the segmentation result and the pose of the target object given by the pose estimation network to obtain the partial point cloud in the object coordinate system. The PCKRF pipeline first completes the partial point cloud with the point completion network and then refines the initial pose with our CIKP method. In the point cloud completion network, the Feature Extractor fuses the point cloud and the RGB color at each corresponding pixel, and the Keypoint Detector predicts the offset from each point to each keypoint to make the completed point cloud more sensitive to pose accuracy. The loss function of the completion network jointly optimizes the keypoint detector loss $L_{kp}$ and the completion decoder loss $L_{cd}$.
  • Figure 3: Qualitative results on the YCB-Video Dataset. Our method achieves higher accuracy than other methods with both low-precision (row 1) and high-precision (rows 2 and 3) initial poses.
  • Figure 4: Qualitative results on the Occlusion LineMOD Dataset. Our method achieves higher accuracy than other methods with both high-precision (row 1) and low-precision (rows 2 and 3) initial poses.
  • Figure 5: Incorrect segmentation results lead to poor performance of the ICP method. Valid masks are marked by a red box, and incorrect masks by a white box.
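
Two of the steps named in the Figure 1 caption, sampling points within a sphere of radius $r$ around a keypoint and recovering the final pose by least-squares fitting, can be sketched as follows. This is a hedged illustration using the standard SVD-based (Kabsch) solution for rigid alignment; the function names and the choice of NumPy are assumptions, and the authors' released code may differ.

```python
import numpy as np

def points_near_keypoint(points, keypoint, r):
    """Return the rows of `points` (N x 3) within distance r of `keypoint`."""
    mask = np.linalg.norm(points - keypoint, axis=1) <= r
    return points[mask]

def fit_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src[i] + t ~= dst[i].

    src, dst: (K x 3) arrays of corresponding keypoints, K >= 3 and not
    collinear. Uses the SVD-based (Kabsch) closed-form solution.
    """
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # 3 x 3 cross-covariance of the centred keypoint sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```

Here `src` would hold the object's model keypoints and `dst` the refined keypoints, so that `(R, t)` is the pose that best maps one set onto the other in the least-squares sense.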