SpyroPose: SE(3) Pyramids for Object Pose Distribution Estimation
Rasmus Laurvig Haugaard, Frederik Hagelskjær, Thorbjørn Mosekjær Iversen
TL;DR
This paper tackles the problem of estimating pose distributions over SE(3) to capture visual ambiguities in object pose estimation. It introduces SpyroPose, which builds an SE(3) pyramid, a hierarchical grid combining SO(3) rotations and 3D translations. The model is trained with a contrastive InfoNCE objective, and importance sampling over the pyramid makes training efficient. The method uses keypoint-based feature extraction within a UNet+ResNet backbone to produce location-aware embeddings, and it reaches real-time inference by evaluating the pyramid sparsely in a coarse-to-fine refinement. Empirically, SpyroPose achieves state-of-the-art rotation distribution estimates on SYMSOL and T-LESS, provides the first quantitative SE(3) distribution results on T-LESS and HB, and shows that fusing distributions from multiple views substantially increases the likelihood of the true pose. The work lays groundwork for probabilistic, uncertainty-aware perception in robotics and opens avenues for principled sensor fusion using pose distributions.
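As a rough illustration of the contrastive objective mentioned above, the sketch below computes a plain InfoNCE loss for one training example: the score of the true pose is contrasted against the scores of sampled candidate poses. The function name `info_nce_loss` and the toy scores are assumptions made for illustration, and any correction for the importance-sampling proposal is left out; this is not the authors' implementation.

```python
import math


def info_nce_loss(scores, positive_index=0):
    """Plain InfoNCE: negative log-softmax of the positive candidate's score.

    `scores` holds unnormalized log-scores for the true pose and the sampled
    negative poses. This hypothetical helper ignores any importance-sampling
    correction that a full training setup might apply.
    """
    m = max(scores)  # subtract the max for a numerically stable log-sum-exp
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_norm - scores[positive_index]


# Toy example: the true pose (index 0) scores higher than two negatives.
loss = info_nce_loss([2.0, 0.0, 0.0])
```

The loss shrinks as the true pose's score rises relative to the negatives, and it equals log(n) when all n candidates receive the same score.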
Abstract
Object pose estimation is a core computer vision problem and often an essential component in robotics. Pose estimation is usually approached by seeking the single best estimate of an object's pose, but this approach is ill-suited for tasks involving visual ambiguity. In such cases it is desirable to estimate the uncertainty as a pose distribution to allow downstream tasks to make informed decisions. Pose distributions can have arbitrary complexity, which motivates estimating unparameterized distributions. Until now, however, such distributions have only been used for orientation estimation on SO(3), due to the difficulty of training on and normalizing over SE(3). We propose a novel method for pose distribution estimation on SE(3). We use a hierarchical grid, a pyramid, which enables efficient importance sampling during training and sparse evaluation of the pyramid at inference, allowing real-time 6D pose distribution estimation. Our method outperforms state-of-the-art methods on SO(3) and, to the best of our knowledge, we provide the first quantitative results on pose distribution estimation on SE(3). Code will be available at spyropose.github.io.
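The idea of sparse, coarse-to-fine evaluation of a pyramid can be sketched on a toy problem. In the example below, a 1D grid on [0, 1) stands in for the SE(3) grid, and `toy_log_score` stands in for the learned model; both, along with the function name `sparse_pyramid_search` and its parameters, are assumptions made for illustration rather than the method's actual grid or network. At each level only the `top_k` highest-scoring cells are subdivided, so the finest level is reached without scoring every cell.

```python
import math


def sparse_pyramid_search(log_score, levels=6, top_k=2, branching=2):
    """Coarse-to-fine search over a toy 1D pyramid on [0, 1).

    Level r has branching**r equal-width cells. Only the top_k best cells
    at each level are subdivided. Returns (center, probability) pairs for
    the retained finest-level cells, best first.
    """
    cells = [0]  # level 0: a single cell covering the whole interval
    for level in range(1, levels + 1):
        n_cells = branching ** level
        # Expand every retained cell into its children at the next level.
        children = [c * branching + j for c in cells for j in range(branching)]
        # Score each child at its cell center and keep the best top_k.
        children.sort(key=lambda c: log_score((c + 0.5) / n_cells), reverse=True)
        cells = children[:top_k]

    n_cells = branching ** levels
    centers = [(c + 0.5) / n_cells for c in cells]
    scores = [log_score(x) for x in centers]
    # Simplification: normalize over the retained cells only, assuming the
    # probability mass outside them is negligible.
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [(x, w / total) for x, w in zip(centers, weights)]


def toy_log_score(x, mode=0.3, sigma=0.05):
    """Hypothetical unnormalized log-density peaked at `mode`."""
    return -0.5 * ((x - mode) / sigma) ** 2


leaves = sparse_pyramid_search(toy_log_score)
best_center, best_prob = leaves[0]
```

With the default settings, the finest level has 64 cells but only a handful are scored per level. The trade-off is that a mode missed at a coarse level cannot be recovered later, so `top_k` controls how much ambiguity the search can track.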
