Shape-biased Texture Agnostic Representations for Improved Textureless and Metallic Object Detection and 6D Pose Estimation

Peter Hönig; Stefan Thalhammer; Jean-Baptiste Weibel; Matthias Hirschmanner; Markus Vincze

TL;DR

This work tackles detection and 6D pose estimation for textureless and metallic objects by inducing a shape bias through randomized UV-mapped textures during synthetic data rendering, achieving geometry-focused representations with negligible overhead. The proposed approach, implemented in a BlenderProc-based pipeline with $n=1226$ textures, outperforms texture-based baselines and a style-transfer baseline across three detectors and two pose estimators, especially under varying illumination and noise. It also demonstrates improved robustness to common perturbations and provides insight via ablations on texture count and mesh origin. The method offers a practical route to reduce reliance on online augmentations and enhances generalization to textureless/metallic materials in real-world robotics scenarios.

Abstract

Recent advances in machine learning have greatly benefited object detection and 6D pose estimation. However, textureless and metallic objects still pose a significant challenge due to few visual cues and the texture bias of CNNs. To address this issue, we propose a strategy for inducing a shape bias in CNN training. In particular, by randomizing the textures applied to object surfaces during data rendering, we create training data without consistent textural cues. This methodology integrates seamlessly into existing data rendering engines and adds negligible computational overhead to data rendering and network training. Our findings demonstrate that the shape bias we induce via randomized texturing improves over existing approaches using style transfer. We evaluate with three detectors and two pose estimators. For the most recent object detector and for pose estimation in general, estimation accuracy improves for textureless and metallic objects. Additionally, we show that our approach increases pose estimation accuracy in the presence of image noise and strong illumination changes. Code and datasets are publicly available at github.com/hoenigpeter/randomized_texturing.
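The core idea of the abstract, giving each object a texture that is unrelated to its identity and that changes from scene to scene, can be illustrated with a minimal sketch. This is not the authors' pipeline and uses no rendering engine API: the function names, the object identifiers, and the texture file names are all hypothetical, and the sketch only plans which texture file would be UV-mapped onto which object in each rendered scene.

```python
import random
from typing import Dict, List, Sequence


def assign_random_textures(
    object_ids: Sequence[str],
    texture_files: Sequence[str],
    rng: random.Random,
) -> Dict[str, str]:
    """Pick one texture file per object, uniformly and independently.

    Hypothetical helper: a real pipeline would pass the chosen file to the
    renderer, which would UV-map it onto the object's mesh.
    """
    if not texture_files:
        raise ValueError("texture pool must not be empty")
    return {obj_id: rng.choice(texture_files) for obj_id in object_ids}


def plan_scenes(
    object_ids: Sequence[str],
    texture_files: Sequence[str],
    num_scenes: int,
    seed: int = 0,
) -> List[Dict[str, str]]:
    """Draw a fresh texture assignment for every scene.

    Because assignments are resampled per scene, no object keeps a
    consistent texture across the training set, so only its geometry
    remains a stable cue.
    """
    rng = random.Random(seed)
    return [
        assign_random_textures(object_ids, texture_files, rng)
        for _ in range(num_scenes)
    ]


if __name__ == "__main__":
    # Placeholder pool of 1226 texture files, matching the pool size in the TL;DR.
    pool = [f"texture_{i:04d}.png" for i in range(1226)]
    for scene in plan_scenes(["obj_01", "obj_02"], pool, num_scenes=3):
        print(scene)
```

Sampling with a seeded `random.Random` keeps the plan reproducible, which helps when comparing training runs that differ only in the size of the texture pool.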

Paper Structure (20 sections, 7 figures, 6 tables)

Introduction
Related Work
2D Object Detection and 6D Pose Estimation
Data Representation
Induction of Shape Bias
Data Rendering for Inducing a Shape Bias
Experiments
Experimental Setup
Methods
Datasets
Metrics
UV-Mapping versus Style Transfer
Object Detection
Object Pose Estimation
Robustness to Image Perturbations
...and 5 more sections

Figures (7)

Figure 1: Induction of Shape Bias. 6D pose of a textureless object is visualized with 3D bounding boxes; the ground truth (blue) and estimate (green) of GDR-Net trained with conventional data (top) and when inducing a shape bias (bottom).
Figure 2: Synthetic Data Generation with Randomized Texturing Pipeline. Instead of sampling color values from a uniform color distribution, we download textures from a creative-commons database and perform low-cost UV-mapping, a method that we call texture randomization. While the figure shows the scene composition and rendering steps with five exemplary textures, the texture randomization method can be used with an arbitrary number of texture files.
Figure 3: Detection Example. Exemplary test images from TLESS and ITODD, with ground truth (blue) and predicted (green) bounding boxes using YOLOx. When a shape bias is induced, YOLOx yields more accurate bounding boxes and better recall and precision; detection and IoU thresholds are both set to $0.5$.
Figure 4: Pose Estimation Examples. Example images from the TLESS and ITODD datasets, showcasing the increased performance for occluded, dark and reflective object pose estimation; GDR-Net for pose estimation and YOLOx without color prior for detection; no detection ground truth for ITODD available.
Figure 5: Number of Textures. Ablation on the influence of the number of random textures for YOLOx on TLESS.
...and 2 more figures
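
The Figure 3 caption mentions detection and IoU thresholds of $0.5$. As a rough illustration of how such thresholds are commonly applied, the sketch below computes the intersection over union (IoU) of two axis-aligned boxes and accepts a prediction only if both its confidence score and its IoU with the ground truth reach the threshold. The `(x1, y1, x2, y2)` box format and the function names are assumptions made for this example; the paper's evaluation code may differ.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap region, clamped at zero if disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(
    pred: Box,
    score: float,
    gt: Box,
    score_thr: float = 0.5,
    iou_thr: float = 0.5,
) -> bool:
    """Hypothetical acceptance rule: confident enough and overlapping enough."""
    return score >= score_thr and box_iou(pred, gt) >= iou_thr


if __name__ == "__main__":
    gt = (0.0, 0.0, 2.0, 2.0)
    print(box_iou((1.0, 1.0, 3.0, 3.0), gt))        # overlap 1, union 7
    print(is_match((0.0, 0.0, 2.0, 1.5), 0.9, gt))  # IoU 0.75, accepted
```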
