Sparse Color-Code Net: Real-Time RGB-Based 6D Object Pose Estimation on Edge Devices
Xingjian Yang, Zhitao Yu, Ashis G. Banerjee
TL;DR
This work addresses real-time, RGB-based 6D object pose estimation on edge devices by introducing Sparse Color-Code Net (SCCN), a three-stage pipeline that uses Sobel-contour features, sparse color-code regression, and a symmetry-aware representation to robustly handle occlusion and object symmetry. The key contributions are an anisotropic color-code and a novel per-pixel symmetry mask, which enable efficient, accurate 2D–3D correspondences that are then passed to a PnP solver as a sparsified point set. The approach runs in real time on an NVIDIA Jetson AGX Xavier (approximately 19 FPS for a single object and 6 FPS for multiple objects) with competitive accuracy, and ablation studies show that the anisotropic and symmetry components are effective without sacrificing speed. The results demonstrate practical feasibility for mobile manipulation and AR, and the authors outline future work toward multi-instance pose estimation, improved generalization, and integration with recognition and probabilistic mapping systems.
Abstract
As robotics and augmented reality applications increasingly rely on precise and efficient 6D object pose estimation, real-time performance on edge devices is required for more interactive and responsive systems. Our proposed Sparse Color-Code Net (SCCN) addresses this requirement with a clear and concise pipeline design. SCCN performs pixel-level predictions on the target object in the RGB image, exploiting the sparsity of essential object geometry features to speed up the Perspective-n-Point (PnP) computation. Additionally, it introduces a novel pixel-level, geometry-based object symmetry representation that integrates seamlessly with the initial pose predictions and resolves symmetric object ambiguities. On an NVIDIA Jetson AGX Xavier, SCCN achieves estimation rates of 19 frames per second (FPS) on the benchmark LINEMOD dataset and 6 FPS on the Occlusion LINEMOD dataset, while consistently maintaining high estimation accuracy at these rates.
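To make the idea of sparse 2D-3D correspondences concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical network output `color_code` (an H x W x 3 map with values in [0, 1]) and a hypothetical boolean `mask` that marks the sparse pixels to keep; in the paper the sparse pixels come from contour features, which are not modeled here. It also assumes the simplest possible decoding, a linear map from color to the object's bounding box, and ignores the anisotropic and symmetry-aware parts of the representation. The function name `sparse_correspondences` and all parameter names are illustrative.

```python
import numpy as np

def sparse_correspondences(color_code, mask, bbox_min, bbox_max):
    """Build sparse 2D-3D correspondences from a per-pixel color-code map.

    color_code: (H, W, 3) float array in [0, 1] (hypothetical network output)
    mask:       (H, W) bool array selecting the sparse pixels to keep
    bbox_min, bbox_max: (3,) corners of the object's bounding box
    """
    rows, cols = np.nonzero(mask)
    # Image points in (x, y) = (column, row) order, the usual PnP convention.
    pts_2d = np.stack([cols, rows], axis=1).astype(np.float64)
    # Assumed linear decoding: color 0 maps to bbox_min, color 1 to bbox_max.
    codes = color_code[rows, cols]
    pts_3d = bbox_min + codes * (bbox_max - bbox_min)
    return pts_2d, pts_3d

# Tiny example: one selected pixel at row 1, column 2.
color_code = np.zeros((4, 4, 3))
color_code[1, 2] = [0.5, 0.0, 1.0]
mask = np.zeros((4, 4), dtype=bool)
mask[1, 2] = True
pts_2d, pts_3d = sparse_correspondences(
    color_code, mask, np.array([-1.0, -1.0, -1.0]), np.array([1.0, 1.0, 1.0])
)
```

The resulting `pts_3d` and `pts_2d` arrays, together with the camera intrinsics, are the kind of input a PnP solver such as OpenCV's `cv2.solvePnPRansac` expects. Because only the masked pixels are kept, the solver works on far fewer points than a dense per-pixel map would provide, which is the source of the speed-up the abstract describes.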
