HccePose(BF): Predicting Front & Back Surfaces to Construct Ultra-Dense 2D-3D Correspondences for Pose Estimation
Yulin Wang, Mengting Hu, Hongli Li, Chen Luo
TL;DR
This work tackles seen-object pose estimation by predicting coordinates for both the object’s front and back surfaces and densely sampling points between them to create ultra-dense 2D-3D correspondences used by a RANSAC-PnP solver. A key contribution is Hierarchical Continuous Coordinate Encoding (HCCE), which represents surface coordinates as multi-level continuous codes and uses a histogram-based scheme to adapt learning weights across levels, improving stability and accuracy. Empirically, the method achieves competitive BOP scores on seven core datasets and outperforms state-of-the-art RGB-based methods by up to 2.4% in BOP score, with further gains when RGB-D data are involved, and also improves 2D segmentation accuracy. The approach emphasizes dual-surface information and dense sampling to strengthen pose estimation, offering practical improvements for industrial and robotics applications, though the model remains object-specific rather than universally generalizable.
Abstract
In pose estimation for seen objects, a prevalent pipeline involves using neural networks to predict dense 3D coordinates of the object surface on 2D images, which are then used to establish dense 2D-3D correspondences. However, current methods primarily focus on more efficient encoding techniques to improve the precision of predicted 3D coordinates on the object's front surface, overlooking the potential benefits of incorporating the back surface and interior of the object. To better utilize the full surface and interior of the object, this study predicts 3D coordinates of both the object's front and back surfaces and densely samples 3D coordinates between them. This process creates ultra-dense 2D-3D correspondences, effectively enhancing pose estimation accuracy based on the Perspective-n-Point (PnP) algorithm. Additionally, we propose Hierarchical Continuous Coordinate Encoding (HCCE) to provide a more accurate and efficient representation of front and back surface coordinates. Experimental results show that the proposed approach outperforms existing state-of-the-art (SOTA) methods listed on the BOP website across seven classic BOP core datasets. Code is available at https://github.com/WangYuLin-SEU/HCCEPose.
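The sampling step described above can be illustrated with a minimal sketch. It is not taken from the released code: the function name `sample_between_surfaces`, the uniform linear interpolation, and the `n_samples` parameter are assumptions made for illustration. The idea it shows is that a pixel's front-surface and back-surface points lie on the same camera ray, and a rigid transform maps lines to lines, so any point interpolated between them in the object frame still projects to that pixel and yields an additional valid 2D-3D correspondence.

```python
import numpy as np

def sample_between_surfaces(pixels, front_xyz, back_xyz, n_samples=5):
    """Interpolate 3D points between front and back surface coordinates.

    pixels:    (N, 2) pixel locations inside the object mask
    front_xyz: (N, 3) predicted object-frame coordinates of the front surface
    back_xyz:  (N, 3) predicted object-frame coordinates of the back surface
    Returns (N * n_samples, 2) image points and (N * n_samples, 3) object points.
    Hypothetical helper: uniform linear sampling is an assumption.
    """
    pixels = np.asarray(pixels, dtype=np.float64)
    front_xyz = np.asarray(front_xyz, dtype=np.float64)
    back_xyz = np.asarray(back_xyz, dtype=np.float64)

    # Interpolation weights: 0 selects the front surface, 1 the back surface.
    t = np.linspace(0.0, 1.0, n_samples).reshape(1, n_samples, 1)

    # Broadcast to (N, n_samples, 3): points along each front-to-back segment.
    pts3d = (1.0 - t) * front_xyz[:, None, :] + t * back_xyz[:, None, :]

    # Every sample on a segment shares the pixel it was predicted for.
    pts2d = np.repeat(pixels[:, None, :], n_samples, axis=1)

    return pts2d.reshape(-1, 2), pts3d.reshape(-1, 3)


# Toy example: two pixels, three samples per pixel.
pixels = np.array([[10, 20], [11, 20]])
front = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
back = np.array([[0.0, 0.0, 2.0], [1.0, 0.0, 4.0]])
img_pts, obj_pts = sample_between_surfaces(pixels, front, back, n_samples=3)
```

The two returned arrays are row-aligned, so they could be passed as the image points and object points to a RANSAC-PnP solver such as OpenCV's `cv2.solvePnPRansac`, together with the camera intrinsics.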
