Close the Sim2real Gap via Physically-based Structured Light Synthetic Data Simulation
Kaixin Bai, Lei Zhang, Zhaopeng Chen, Fang Wan, Jianwei Zhang
TL;DR
This paper tackles the data bottleneck in industrial robotic perception by introducing a physically-based structured-light synthesis pipeline that generates photorealistic RGBD data with ground-truth annotations. Using Blender Cycles and ray-traced Gray-code projection, it produces realistic depth with structured-light noise, enabling effective sim2real transfer for object detection, instance segmentation, and robotic grasping. The authors demonstrate that depth-based inputs and domain-adaptation strategies reduce the sim2real gap, improving real-world performance and reducing development time compared to domain-randomized approaches. The work offers a practical, scalable tool for industrial DL deployment and points to future expansion of the dataset and optimization of pose estimation and grasping algorithms.
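To make the Gray-code projection mentioned above concrete, the following is a minimal sketch of how binary-reflected Gray-code stripe patterns can be generated and decoded back into projector column indices. It is an illustration under stated assumptions, not the authors' implementation: the function names `gray_code_patterns` and `decode_gray_code` are hypothetical, the patterns are assumed to be already thresholded to 0/1, and the ray-traced rendering in Blender Cycles and the triangulation step that turns column indices into depth are both omitted.

```python
import numpy as np


def gray_code_patterns(width: int, height: int) -> np.ndarray:
    """Return binary stripe images of shape (n_bits, height, width).

    Hypothetical helper: each image encodes one bit of the Gray code
    of the projector column index, most significant bit first.
    """
    n_bits = max(1, (width - 1).bit_length())
    cols = np.arange(width, dtype=np.uint32)
    gray = cols ^ (cols >> 1)  # binary-reflected Gray code per column
    shifts = np.arange(n_bits - 1, -1, -1, dtype=np.uint32)
    bits = ((gray[None, :] >> shifts[:, None]) & 1).astype(np.uint8)
    # Every row of a pattern is identical (vertical stripes).
    return np.repeat(bits[:, None, :], height, axis=1)


def decode_gray_code(bits: np.ndarray) -> np.ndarray:
    """Recover column indices from thresholded patterns (n_bits, H, W)."""
    binary = np.zeros(bits.shape[1:], dtype=np.uint32)
    prev = np.zeros(bits.shape[1:], dtype=np.uint32)
    for i in range(bits.shape[0]):
        # Gray-to-binary: b_i = b_{i-1} XOR g_i, starting from the MSB.
        prev = prev ^ bits[i].astype(np.uint32)
        binary = (binary << 1) | prev
    return binary


patterns = gray_code_patterns(width=8, height=2)
decoded = decode_gray_code(patterns)
```

Because neighbouring columns differ in exactly one bit, a thresholding error at a stripe boundary shifts the decoded index by at most one column, which is one reason Gray codes are commonly preferred over plain binary codes for structured light.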
Abstract
Despite the substantial progress in deep learning, its adoption in industrial robotics projects remains limited, primarily due to challenges in data acquisition and labeling. Previous sim2real approaches using domain randomization require extensive scene and model optimization. To address these issues, we introduce an innovative physically-based structured light simulation system, generating both RGB and physically realistic depth images, surpassing previous dataset generation tools. We create an RGBD dataset tailored for robotic industrial grasping scenarios and evaluate it across various tasks, including object detection, instance segmentation, and embedding sim2real visual perception in industrial robotic grasping. By reducing the sim2real gap and enhancing deep learning training, we facilitate the application of deep learning models in industrial settings. Project details are available at https://baikaixinpublic.github.io/structured_light_3D_synthesizer/.
