Synthetic Data Generation for Bridging Sim2Real Gap in a Production Environment

Parth Rawal, Mrunal Sompura, Wolfgang Hintze

TL;DR

This paper tackles the gap between synthetic and real data (Sim2Real) in production-focused computer vision by introducing a scalable pipeline that uses CAD models to generate labeled synthetic data. It automates CAD-to-mesh export via FreeCAD, builds a modular data-generation workflow on BlenderProc with five basic procedures, and demonstrates that combining procedures improves on single-procedure training in real production environments. The pipeline includes ground-truth visualization, and validation with YOLOv7 on real scenes shows improvements of up to 15% for carefully chosen procedure combinations, which underlines the importance of target-domain knowledge. The approach is intended to scale to thousands of annotated images for object detection, segmentation, and 6D pose estimation, reducing labeling effort and supporting AI-based, robot-assisted production.

Abstract

Synthetic data is increasingly used to train deep neural networks for computer vision applications such as object detection, object segmentation, and 6D object pose estimation. Domain randomization plays an important role here in reducing the simulation-to-reality gap. However, this generalization might not be effective in specialized domains such as a production environment involving complex assemblies. Individual parts, learned from synthetic images, are either integrated into much larger assemblies, which makes them indistinguishable from their counterparts and results in false positives, or partially occluded just enough to give rise to false negatives. Domain knowledge is vital in these cases and, if incorporated effectively while generating synthetic data, can considerably improve bridging the simulation-to-reality gap. This paper focuses on synthetic data generation procedures for parts and assemblies used in a production environment. The basic procedures for synthetic data generation and their various combinations are evaluated and compared on images captured in a production environment, where results show up to 15% improvement using combinations of basic procedures. Reducing the simulation-to-reality gap in this way can help realize the true potential of robot-assisted production using artificial intelligence.

Paper Structure (12 sections, 9 figures, 7 tables)

This paper contains 12 sections, 9 figures, 7 tables.

Figures (9)

  • Figure 1: Synthetic images generated with five different domain randomization procedures
  • Figure 2: Scalable pipeline concept from CAD model to trained model with interfaces in between
  • Figure 3: Demonstrator used for validation of synthetic data in a real environment
  • Figure 4: Multiple steps for exporting part meshes and their transformations in assembly origin
  • Figure 5: Five basic data generation procedures explained in a top-down approach
  • ...and 4 more figures