Synthetic data enables faster annotation and robust segmentation for multi-object grasping in clutter

Dongmyoung Lee, Wei Chen, Nicolas Rojas

TL;DR

The paper tackles the data annotation bottleneck in multi-object grasping under clutter by introducing a hybrid synthetic–real training pipeline. It uses a WGAN-GP-based generator to create self-labeled fruit crops and composited scenes, enabling efficient instance segmentation with limited real data and translating to improved labelling and grasp success in real-world tasks. Results show that the Gen-hybrid approach outperforms real-only and CP-hybrid baselines, particularly when real data is scarce, and demonstrates robust grasping in clutter without relying on CAD models. This work points to a practical, data-efficient path for training perception systems for robotic manipulation in unstructured environments.

Abstract

Object recognition and object pose estimation in robotic grasping continue to be significant challenges, since building a labelled dataset can be time-consuming and financially costly in terms of data collection and annotation. In this work, we propose a synthetic data generation method that minimizes human intervention and makes downstream image segmentation algorithms more robust by combining a generated synthetic dataset with a smaller real-world dataset (hybrid dataset). Annotation experiments show that the proposed synthetic scene generation can diminish labelling time dramatically. RGB image segmentation is trained with the hybrid dataset and combined with depth information to produce pixel-to-point correspondences for individual segmented objects. The object to grasp is then determined by the confidence score of the segmentation algorithm. Pick-and-place experiments demonstrate that segmentation trained on our hybrid dataset (98.9%, 70%) outperforms the real dataset and a publicly available dataset by (6.7%, 18.8%) and (2.8%, 10%) in terms of labelling and grasping success rate, respectively. Supplementary material is available at https://sites.google.com/view/synthetic-dataset-generation.
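The pixel-to-point step and the confidence-based target choice described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera model, a depth image already aligned to the RGB frame, and hypothetical names (`mask_to_points`, `pick_target`, and the intrinsics `fx`, `fy`, `cx`, `cy`) that do not appear in the paper.

```python
import numpy as np

def mask_to_points(mask, depth, fx, fy, cx, cy):
    """Back-project the pixels of one instance mask to 3D camera-frame points.

    Assumes a pinhole camera and a depth map aligned to the RGB image
    (both are assumptions of this sketch, not details from the paper).
    """
    valid = mask.astype(bool) & (depth > 0)   # drop pixels with missing depth
    v, u = np.nonzero(valid)                  # v = row index, u = column index
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)        # shape (num_pixels, 3)

def pick_target(masks, scores, depth, intrinsics):
    """Choose the instance with the highest segmentation confidence
    and return its index together with its 3D points."""
    best = int(np.argmax(scores))
    points = mask_to_points(masks[best], depth, *intrinsics)
    return best, points
```

Here `masks` would be the per-instance binary masks and `scores` the matching confidence values produced by whatever segmentation network is used; the returned points could then feed a grasp planner.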

Paper Structure (15 sections, 11 figures, 4 tables)

Figures (11)

  • Figure 1: Robot learns to grasp multiple objects in clutter and sort them into target boxes with the proposed instance segmentation algorithm.
  • Figure 2: The overall procedure for fruit grasping in clutter.
  • Figure 3: The network architecture of WGAN-GP algorithm.
  • Figure 4: Sample output of generated fruits using WGAN-GP algorithm.
  • Figure 5: A synthetic scene is generated by randomly pasting object-wise images into background scenes. (A): Generated fruit images and segmented pixels representing the target object. (B): Synthetic scenes with these instances.
  • ...and 6 more figures
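
The scene synthesis summarized in the Figure 5 caption (pasting object-wise images into background scenes so that labels come for free) might look roughly like the sketch below. It is an illustration under stated assumptions rather than the paper's code: `compose_scene` is a hypothetical helper, each crop is assumed to be an `(image, mask)` pair no larger than the background, and later pastes are assumed to occlude earlier ones.

```python
import numpy as np

def compose_scene(background, crops, rng):
    """Paste (image, mask) crops at random positions on a background.

    Returns the composited scene and one full-size boolean mask per
    pasted object, so the instance labels need no manual annotation.
    """
    scene = background.copy()
    H, W = scene.shape[:2]
    instance_masks = []
    for img, mask in crops:
        h, w = mask.shape
        top = int(rng.integers(0, H - h + 1))    # random top-left corner
        left = int(rng.integers(0, W - w + 1))
        m = mask.astype(bool)
        region = scene[top:top + h, left:left + w]   # view into scene
        region[m] = img[m]                           # copy object pixels only
        full = np.zeros((H, W), dtype=bool)
        full[top:top + h, left:left + w] = m
        for prev in instance_masks:
            prev &= ~full        # newer object hides overlapped older pixels
        instance_masks.append(full)
    return scene, instance_masks
```

Calling it with `rng = np.random.default_rng(seed)` keeps generated scenes reproducible; the object crops themselves could come from any source, including a generative model.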