Object-Pose Estimation With Neural Population Codes
Heiko Hoffmann, Richard Hoffmann
TL;DR
This work addresses object-pose estimation under symmetry-induced rotational ambiguity by introducing a neural population code for object rotation. Rotation is encoded as an activation pattern over a population of units on a sphere×circle, which permits a direct mapping from image to rotation with end-to-end learning and represents the ambiguity of symmetric objects as multiple activation peaks. On the T-LESS dataset, the method runs inference in $3.2$ ms on an Apple M1 CPU and reaches a Maximum Symmetry-Aware Surface Distance (MSSD) accuracy of $84.7\%$ from grayscale input alone, compared with $69.7\%$ when mapping directly to pose. These results suggest practical potential for real-time robotic assembly, with possible extensions to direct grasp-posture control.
Abstract
Robotic assembly tasks require object-pose estimation, particularly when they avoid costly mechanical constraints. Object symmetry complicates the direct mapping of sensory input to object rotation, as the rotation becomes ambiguous and lacks a unique training target. Some proposed solutions evaluate multiple pose hypotheses against the input or predict a probability distribution, but these approaches suffer from significant computational overhead. Here, we show that representing object rotation with a neural population code overcomes these limitations, enabling a direct mapping to rotation and end-to-end learning. As a result, population codes facilitate fast and accurate pose estimation. On the T-LESS dataset, we achieve inference in 3.2 milliseconds on an Apple M1 CPU and a Maximum Symmetry-Aware Surface Distance (MSSD) accuracy of 84.7% using only grayscale image input, compared to 69.7% accuracy when directly mapping to pose.
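To illustrate why a population code can give a symmetric object a well-defined training target, the following minimal sketch encodes a single in-plane angle on a circle. It is a deliberate simplification and not the method described above: the paper's code lives on a sphere×circle, whereas here only the circle is modeled, and the von Mises-shaped tuning curves, the unit count `n_units`, the width `kappa`, and all function names are assumptions chosen for illustration. For an object with `fold`-fold symmetry, the target pattern simply carries one activation bump per equivalent angle, so no single "correct" rotation has to be picked.

```python
import math


def encode_angle(theta, n_units=36, kappa=20.0):
    """Activations of units with evenly spaced preferred angles.

    Assumption: von Mises-shaped tuning, normalized so the peak value is 1.
    """
    prefs = [2 * math.pi * i / n_units for i in range(n_units)]
    return [math.exp(kappa * (math.cos(theta - p) - 1.0)) for p in prefs]


def encode_symmetric(theta, fold, n_units=36, kappa=20.0):
    """Target pattern for a hypothetical object with `fold`-fold symmetry.

    Every equivalent angle theta + 2*pi*k/fold contributes one bump;
    bumps are combined with an element-wise maximum.
    """
    target = [0.0] * n_units
    for k in range(fold):
        bump = encode_angle(theta + 2 * math.pi * k / fold, n_units, kappa)
        target = [max(t, b) for t, b in zip(target, bump)]
    return target


def decode_peak(activations):
    """Read out the preferred angle of the most active unit."""
    n = len(activations)
    best = max(range(n), key=lambda i: activations[i])
    return 2 * math.pi * best / n


def count_peaks(activations, threshold=0.5):
    """Count local maxima above `threshold` on the circular unit array."""
    n = len(activations)
    return sum(
        1
        for i in range(n)
        if activations[i] > threshold
        and activations[i] > activations[i - 1]
        and activations[i] >= activations[(i + 1) % n]
    )


if __name__ == "__main__":
    pattern = encode_symmetric(math.radians(40), fold=2)
    print("peaks:", count_peaks(pattern))
    print("decoded angle (deg):", math.degrees(decode_peak(pattern)))
```

In this toy setting, decoding any one of the peaks yields a rotation that is valid up to the object's symmetry, which is the property a symmetry-aware metric such as MSSD rewards.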
