COBRA -- COnfidence score Based on shape Regression Analysis for method-independent quality assessment of object pose estimation from single images
Panagiotis Sapoutzoglou, Georgios Giapitzakis, Georgios Floros, George Terzakis, Maria Pateraki
TL;DR
COBRA addresses the lack of a method-agnostic runtime confidence measure for 6D pose estimation from single images by constructing a lightweight directional distance field template, modeled with Gaussian Processes (GPs) and learned from sparse interior reference points. Pose quality is scored by comparing back-projected image points against this template via a mixture of GP priors, yielding a probabilistic confidence that correlates with traditional pose accuracy metrics such as ADD. The approach is validated on ShapeNetCore and IndustryShapes, showing strong shape-representation fidelity (low Chamfer distance and high F-score) and a robust negative correlation between ADD and COBRA confidence, with kernel choice and the number of reference points identified as key design factors. These results demonstrate a practical, interpretable tool for assessing pose estimates in robotics and vision applications, including real-world industrial scenarios. The authors also note limitations around reference-point placement and point to automated coverage guarantees and fully automated template construction as future work.
Abstract
We propose a generic procedure for assessing 6D object pose estimates. Our approach evaluates discrepancies in the geometry of the observed object, specifically its estimated back-projection in 3D, against a putative functional shape representation comprising mixtures of Gaussian Processes, which acts as a template. Each Gaussian Process is trained to yield a fragment of the object's surface in a radial fashion with respect to designated reference points. We further define a pose confidence measure as the average probability of pixel back-projections in the Gaussian mixture. The goal of our experiments is two-fold: (a) we demonstrate that our functional representation is sufficiently accurate as a shape template on which the probability of back-projected object points can be evaluated, and (b) we show that the resulting confidence scores based on these probabilities are indeed a consistent quality measure of pose.
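
To make the idea of an averaged mixture score concrete, the following is a minimal sketch in Python and not the authors' implementation. It assumes that each trained Gaussian Process can be replaced by a hypothetical callable `predict(direction)` returning a mean and standard deviation of the surface distance along a unit direction from a reference point, that the per-point score is a unit-peak Gaussian of the radial residual, and that mixture weights are uniform. The names `radial_score` and `pose_confidence` are illustrative only; the abstract does not specify any of these details.

```python
import math


def radial_score(point, center, predict):
    """Score one 3D point against one radial template centred at `center`.

    `predict(direction)` is a hypothetical stand-in for a trained GP: it
    returns (mean, std) of the surface distance along a unit direction.
    """
    offset = [p - c for p, c in zip(point, center)]
    r = math.sqrt(sum(o * o for o in offset))
    if r == 0.0:
        return 0.0  # direction is undefined at the reference point itself
    direction = [o / r for o in offset]
    mean, std = predict(direction)
    z = (r - mean) / std
    # Assumption: unit-peak Gaussian of the residual, so the score lies in [0, 1].
    return math.exp(-0.5 * z * z)


def pose_confidence(points, templates, weights=None):
    """Average the weighted mixture score over all back-projected points."""
    if weights is None:
        # Assumption: uniform mixture weights over the reference points.
        weights = [1.0 / len(templates)] * len(templates)
    total = 0.0
    for point in points:
        total += sum(w * radial_score(point, center, predict)
                     for w, (center, predict) in zip(weights, templates))
    return total / len(points)


# Toy template: a unit sphere seen from its centre, with constant uncertainty.
unit_sphere = ((0.0, 0.0, 0.0), lambda direction: (1.0, 0.05))

on_surface = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, -1.0)]
shifted = [(x + 0.2, y, z) for x, y, z in on_surface]  # simulates a pose error

good = pose_confidence(on_surface, [unit_sphere])
bad = pose_confidence(shifted, [unit_sphere])
```

In this toy setting, points lying exactly on the template surface receive the maximum score, while the shifted points, which stand in for back-projections under an inaccurate pose, receive a lower average. A real template would use several reference points and direction-dependent GP predictions rather than a constant radius.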
