Improving 6D Object Pose Estimation of metallic Household and Industry Objects

Thomas Pöllabauer, Michael Gasser, Tristan Wirth, Sarah Berkei, Volker Knauthe, Arjan Kuijper

TL;DR

This work addresses the reduced accuracy of 6D pose estimation for metallic objects caused by reflections and specular highlights. It introduces a new BOP-compatible metallic dataset rendered with physically-based rendering to mirror industrial lighting and backgrounds, and extends the GDRNPP framework with two new heads: keypoint heatmap prediction and material properties estimation. The key contributions also include leveraging Bottleneck Attention Modules to fuse geometric and appearance cues, and demonstrating substantial performance gains on metallic objects across standard 6D pose metrics. The findings show that explicit geometric keypoints and material-aware predictions can significantly improve pose estimation in challenging metallic scenarios, advancing applicability in robotics and automation; the dataset is publicly available for further research.

Abstract

6D object pose estimation suffers from reduced accuracy when applied to metallic objects. We set out to improve the state of the art by addressing challenges such as reflections and specular highlights in industrial applications. Our novel BOP-compatible dataset, featuring a diverse set of metallic objects (cans, household, and industrial items) under various lighting and background conditions, provides additional geometric and visual cues. We demonstrate that these cues can be effectively leveraged to enhance overall performance. To illustrate the usefulness of the additional features, we improve upon the GDRNPP algorithm by introducing an additional keypoint prediction head and a material estimator head in order to improve spatial scene understanding. Evaluations on the new dataset show improved accuracy for metallic objects, supporting the hypothesis that additional geometric and visual cues can improve learning.

Paper Structure

This paper contains 14 sections, 2 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Metallic objects from households and industry that are used in our novel 6D pose estimation dataset.
  • Figure 2: Example scenes from our dataset representing the five lighting scenarios, which use built-in Blender light sources: (a) ambient light with one point light source, (b) one point light source, (c) only ambient illumination, (d) ambient illumination with a spot light source, and (e) multiple spot light sources.
  • Figure 3: Images from our dataset (a) without additional modalities, (b) with additional occluding objects from the ITODD dataset, and (c) with reflective surfaces.
  • Figure 4: Extension of GDRNPP to predict additional geometric keypoints and object material information. Two additional heads in the decoder predict, respectively, a heatmap encoding the projected keypoint locations and a pixel-wise difference image representing the change to the input crop needed to simulate a non-metallic surface.
  • Figure 5: Given a view of the target object, we project object points into the view, remove hidden points, identify points with relevant surface information as keypoints, and finally derive a continuous signal by generating a heatmap.
  • ...and 1 more figures
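
The keypoint heatmap pipeline described in the Figure 5 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`project_points`, `visible_mask`, `gaussian_heatmap`), the per-pixel z-buffer used for hidden-point removal, and the Gaussian width `sigma` are all assumptions made for this sketch. The step that selects points with relevant surface information is omitted; every visible point is treated as a keypoint here.

```python
import numpy as np

def project_points(pts, K, R, t):
    """Project Nx3 object points into the image.

    Returns Nx2 pixel coordinates and N camera-frame depths.
    Assumes all points lie in front of the camera (depth > 0).
    """
    cam = pts @ R.T + t            # object frame -> camera frame
    uvw = cam @ K.T                # apply pinhole intrinsics
    return uvw[:, :2] / uvw[:, 2:3], cam[:, 2]

def visible_mask(uv, depth, shape, tol=1e-3):
    """Crude hidden-point removal (an assumption of this sketch):
    keep only the nearest point that falls into each pixel."""
    h, w = shape
    px = np.round(uv).astype(int)
    inside = (
        (px[:, 0] >= 0) & (px[:, 0] < w)
        & (px[:, 1] >= 0) & (px[:, 1] < h)
        & (depth > 0)
    )
    idx = np.flatnonzero(inside)
    zbuf = np.full((h, w), np.inf)
    # Unbuffered minimum so repeated pixels keep the smallest depth.
    np.minimum.at(zbuf, (px[idx, 1], px[idx, 0]), depth[idx])
    vis = np.zeros(len(uv), dtype=bool)
    vis[idx] = depth[idx] <= zbuf[px[idx, 1], px[idx, 0]] + tol
    return vis

def gaussian_heatmap(uv, shape, sigma=2.0):
    """Turn discrete keypoint pixels into a continuous signal by
    taking the per-pixel maximum over one Gaussian per keypoint."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    heat = np.zeros((h, w))
    for u, v in uv:
        g = np.exp(-((xs - u) ** 2 + (ys - v) ** 2) / (2 * sigma ** 2))
        heat = np.maximum(heat, g)
    return heat
```

In use, one would call `project_points` with the ground-truth pose, filter the result with `visible_mask`, and pass the surviving pixel coordinates to `gaussian_heatmap` to obtain a training target for a heatmap head.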