DKPMV: Dense Keypoints Fusion from Multi-View RGB Frames for 6D Pose Estimation of Textureless Objects

Jiahong Chen; Jinghao Wang; Zi Wang; Ziwen Wang; Banglei Guan; Qifeng Yu

TL;DR

The paper tackles 6D pose estimation of textureless objects from multi-view RGB images alone, avoiding reliance on depth measurements that are often missing or unreliable for such objects. It introduces DKPMV, a dense keypoint fusion pipeline that combines a three-stage progressive pose optimization with a keypoint network enhanced by attentional aggregation and symmetry-aware training. Experiments on the ROBI dataset show that DKPMV achieves state-of-the-art performance among multi-view RGB methods and often surpasses RGB-D baselines, especially in challenging RealSense scenarios. The work promises practical impact for real-time robotics by enabling robust pose estimation under occlusion and object symmetry without expensive depth sensors.

Abstract

6D pose estimation of textureless objects is valuable for industrial robotic applications, yet it remains challenging because depth information is frequently lost. Current multi-view methods either rely on depth data or insufficiently exploit multi-view geometric cues, which limits their performance. In this paper, we propose DKPMV, a pipeline that achieves dense keypoint-level fusion using only multi-view RGB images as input. We design a three-stage progressive pose optimization strategy that leverages dense multi-view keypoint geometry. To enable effective dense keypoint fusion, we enhance the keypoint network with attentional aggregation and symmetry-aware training, improving prediction accuracy and resolving ambiguities on symmetric objects. Extensive experiments on the ROBI dataset demonstrate that DKPMV outperforms state-of-the-art multi-view RGB approaches and even surpasses RGB-D methods in the majority of cases. The code will be available soon.
