EfficientPose: An efficient, accurate and scalable end-to-end 6D multi object pose estimation approach

Yannick Bukschat; Marcus Vetter

TL;DR

EfficientPose tackles real-time 6D pose estimation for multiple objects in RGB images by extending EfficientDet with rotation and translation subnetworks and by introducing 6D augmentation, an augmentation method for direct pose estimation that improves generalization. It estimates the poses of all objects end-to-end in a single shot, without per-object PnP or RANSAC post-processing, and its size is scaled via a single hyperparameter $\phi$. On Linemod, it achieves a state-of-the-art ADD(-S) accuracy of 97.35% while running at over 27 FPS, and it demonstrates robust multi-object capability on Occlusion. This work narrows the gap between direct 6D pose estimation and 2D+PnP pipelines, which matters in practice for robotics, autonomous systems, and augmented reality.

Abstract

In this paper we introduce EfficientPose, a new approach for 6D object pose estimation. Our method is highly accurate, efficient and scalable over a wide range of computational resources. Moreover, it can detect the 2D bounding boxes of multiple objects and instances and estimate their full 6D poses in a single shot. This eliminates the significant increase in runtime that other approaches suffer from when dealing with multiple objects, because those approaches first detect 2D targets, e.g. keypoints, and afterwards solve a Perspective-n-Point problem for each object to obtain its 6D pose. We also propose a novel augmentation method for direct 6D pose estimation approaches, called 6D augmentation, which improves performance and generalization. Our approach achieves a new state-of-the-art accuracy of 97.35% in terms of the ADD(-S) metric on the widely used 6D pose estimation benchmark dataset Linemod using RGB input, while still running end-to-end at over 27 FPS. Because it inherently handles multiple objects and instances and fuses single-shot 2D object detection with 6D pose estimation, our approach runs end-to-end at over 26 FPS even with multiple (eight) objects, making it highly attractive for many real-world scenarios. Code will be made publicly available at https://github.com/ybkscht/EfficientPose.
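The abstract reports accuracy in terms of ADD(-S) without defining it. The sketch below assumes the definition commonly used on Linemod: ADD averages the distance between corresponding model points under the ground-truth and predicted poses, ADD-S (for symmetric objects) averages the distance to the closest predicted point instead, and a pose counts as correct when that distance is below 10% of the model diameter. The function names, the 10% threshold default, and the toy point set are illustrative assumptions, not taken from the EfficientPose code.

```python
import math


def transform(points, R, t):
    """Apply rotation R (3x3 nested lists) and translation t (length 3) to 3D points."""
    return [
        tuple(sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3))
        for p in points
    ]


def add_distance(points, R_gt, t_gt, R_pred, t_pred):
    """ADD: mean distance between corresponding model points under both poses."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    return sum(math.dist(a, b) for a, b in zip(gt, pred)) / len(points)


def add_s_distance(points, R_gt, t_gt, R_pred, t_pred):
    """ADD-S: mean distance from each ground-truth point to its closest predicted point."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    return sum(min(math.dist(a, b) for b in pred) for a in gt) / len(points)


def is_correct(distance, diameter, threshold=0.1):
    """Assumed acceptance rule: distance below a fraction of the model diameter."""
    return distance < threshold * diameter


# Toy model (hypothetical): four points that are symmetric under a 180 degree turn about z.
model = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
diameter = 2.0  # largest distance between any two model points

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
half_turn_z = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]
zero = [0.0, 0.0, 0.0]

# The predicted pose differs from the ground truth by the object's own symmetry.
add = add_distance(model, identity, zero, half_turn_z, zero)
add_s = add_s_distance(model, identity, zero, half_turn_z, zero)
print("ADD:", add, "accepted:", is_correct(add, diameter))
print("ADD-S:", add_s, "accepted:", is_correct(add_s, diameter))
```

The toy case shows why the symmetric variant exists: a half turn maps the point set onto itself, so ADD-S accepts the prediction while plain ADD penalizes it. The nearest-point search here is quadratic in the number of model points; for real meshes a spatial index such as a k-d tree would be the usual choice.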
