Resolving Symmetry Ambiguity in Correspondence-based Methods for Instance-level Object Pose Estimation
Yongliang Lin, Yongzhi Su, Sandeep Inuganti, Yan Di, Naeem Ajilforoushan, Hanqing Yang, Yu Zhang, Jason Rambach
TL;DR
The paper tackles symmetry-induced ambiguity in RGB-only 6D pose estimation by replacing the usual one-to-one 2D-3D correspondences with one-to-many correspondences. It introduces SymCode, a symmetry-aware binary surface encoding, and SymNet, an end-to-end network that regresses the pose $(\mathbf{R}, \mathbf{t})$ directly, without PnP or RANSAC, using a CPR module and symmetry-aware losses. The approach is evaluated on T-LESS and IC-BIN, two benchmarks made up mostly of symmetric objects, where it runs faster than state-of-the-art baselines at comparable accuracy. Ablations support the benefits of end-to-end regression, the choice of code length, and the one-to-many formulation. The result is a practical, GPU-friendly approach to real-time pose estimation of symmetric and near-symmetric objects from RGB data, with potential applicability to other symmetry-rich perception tasks.
Abstract
Estimating the 6D pose of an object from a single RGB image is a critical task that becomes more challenging when the object is symmetric. Recent approaches typically establish one-to-one correspondences between image pixels and 3D object surface vertices. For symmetric objects, however, one-to-one correspondences are ambiguous, because several surface vertices can produce the same image appearance. To address this, we propose SymCode, a symmetry-aware surface encoding that encodes the object surface vertices based on one-to-many correspondences, which removes this ambiguity. We also introduce SymNet, a fast end-to-end network that directly regresses the 6D pose parameters without solving a PnP problem. On the T-LESS and IC-BIN benchmarks, which consist mostly of symmetric objects, our method achieves faster runtime with comparable accuracy. Our source code will be released upon acceptance.
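To make the one-to-many idea concrete, the following is a minimal sketch and not the SymCode encoding itself, which is a binary code whose construction is not described here. It assumes a toy object (four rectangle corners) with a single 180° rotational symmetry about the z-axis, and gives every vertex a label shared with all of its symmetric counterparts. The names `rot_z` and `symmetry_orbit_labels` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def rot_z(theta):
    """Rotation matrix about the z-axis by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def symmetry_orbit_labels(vertices, symmetries, decimals=6):
    """Assign each vertex a label shared by all its symmetric counterparts.

    Assumption for this sketch: `symmetries` is the full, finite list of
    symmetry rotations of the object, including the identity.
    """
    canonical = {}  # canonical orbit representative -> integer label
    labels = []
    for v in vertices:
        # Orbit of v: its images under every symmetry transform.
        orbit = np.round(np.array([S @ v for S in symmetries]), decimals) + 0.0
        # Use the lexicographically smallest orbit member as the shared key.
        key = min(map(tuple, orbit))
        labels.append(canonical.setdefault(key, len(canonical)))
    return np.array(labels)

# Toy object: corners of a 4 x 2 rectangle in the xy-plane.
vertices = np.array([
    [ 2.0,  1.0, 0.0],
    [-2.0,  1.0, 0.0],
    [-2.0, -1.0, 0.0],
    [ 2.0, -1.0, 0.0],
])
symmetries = [np.eye(3), rot_z(np.pi)]

labels = symmetry_orbit_labels(vertices, symmetries)
print(labels.tolist())  # -> [0, 1, 0, 1]

# Rotating the object by its symmetry leaves the per-position labels unchanged,
# so a network trained to predict these labels sees one consistent target.
rotated = vertices @ rot_z(np.pi).T
rotated_labels = symmetry_orbit_labels(rotated, symmetries)
print(rotated_labels.tolist())  # -> [0, 1, 0, 1]
```

In this sketch, diagonally opposite corners share a label, so one label corresponds to several 3D vertices. A one-to-one scheme would instead give the four corners four distinct labels, and the two indistinguishable views of the rectangle would yield conflicting training targets for the same pixels.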
