Any6D: Model-free 6D Pose Estimation of Novel Objects
Taeyeop Lee, Bowen Wen, Minjun Kang, Gyuree Kang, In So Kweon, Kuk-Jin Yoon
TL;DR
Any6D tackles 6D pose estimation for novel objects without relying on textured CAD models or multiview references. It introduces a two-stage object alignment pipeline that first reconstructs a normalized shape from a single RGB-D anchor image and then estimates the object's metric-scale size and pose. A refinement step and a render-and-compare strategy then predict the relative pose to a query image. On five real-world datasets, Any6D achieves state-of-the-art results across standard pose metrics, demonstrating strong generalization to unseen objects, occlusions, and cross-environment variations. The approach reduces dependence on detailed object data and facilitates robust manipulation and augmented reality applications in real-world settings.
Abstract
We introduce Any6D, a model-free framework for 6D object pose estimation that requires only a single RGB-D anchor image to estimate both the 6D pose and size of unknown objects in novel scenes. Unlike existing methods that rely on textured 3D models or multiple viewpoints, Any6D leverages a joint object alignment process to enhance 2D-3D alignment and metric scale estimation for improved pose accuracy. Our approach integrates a render-and-compare strategy to generate and refine pose hypotheses, enabling robust performance in scenarios with occlusions, non-overlapping views, diverse lighting conditions, and large cross-environment variations. We evaluate our method on five challenging datasets (REAL275, Toyota-Light, HO3D, YCBInEOAT, and LM-O), where it significantly outperforms state-of-the-art methods for novel object pose estimation. Project page: https://taeyeop.com/any6d
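To make the notion of a relative pose between the anchor and query images concrete, the following is a minimal sketch of how two object poses expressed in the camera frame can be composed into a relative transform. It is not the Any6D implementation: the helper names (`make_pose`, `rot_z`, `relative_pose`), the 4x4 homogeneous-matrix convention, and the example numbers are all assumptions made for illustration.

```python
import numpy as np


def make_pose(rotation, translation):
    """Build a 4x4 homogeneous pose from a 3x3 rotation and a 3-vector (hypothetical helper)."""
    pose = np.eye(4)
    pose[:3, :3] = rotation
    pose[:3, 3] = translation
    return pose


def rot_z(angle_rad):
    """Rotation about the z-axis by angle_rad radians."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def relative_pose(T_anchor, T_query):
    """Return T_rel such that T_rel @ T_anchor == T_query.

    Both inputs are assumed to be object-to-camera poses; this convention
    is an assumption of the sketch, not something stated in the abstract.
    """
    return T_query @ np.linalg.inv(T_anchor)


# Illustrative values only: object pose in the anchor view and in the query view.
T_anchor = make_pose(rot_z(0.0), [0.0, 0.0, 0.5])
T_query = make_pose(rot_z(np.pi / 2), [0.1, 0.0, 0.6])

T_rel = relative_pose(T_anchor, T_query)
```

Under this convention, applying `T_rel` to the anchor pose reproduces the query pose, so an error in either estimated pose propagates directly into the relative transform.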
