DemoGen: Synthetic Demonstration Generation for Data-Efficient Visuomotor Policy Learning
Zhengrong Xue, Shuying Deng, Zhenyang Chen, Yixuan Wang, Zhecheng Yuan, Huazhe Xu
TL;DR
DemoGen addresses the data inefficiency of visuomotor policy learning by generating fully synthetic, spatially augmented demonstrations from a single human-collected demonstration per task. It combines Task and Motion Planning (TAMP)-based action adaptation with 3D point-cloud observation synthesis to produce usable, varied demonstrations at negligible compute cost. Real-world and simulated experiments show significant improvements in spatial generalization across diverse tasks and platforms, and extensions add disturbance resistance and obstacle avoidance. The approach reduces the data-collection burden in robotic manipulation while preserving closed-loop control. Visual mismatch and point-cloud segmentation remain limitations and areas for further improvement.
Abstract
Visuomotor policies have shown great promise in robotic manipulation but often require substantial amounts of human-collected data for effective performance. A key reason underlying the data demands is their limited spatial generalization capability, which necessitates extensive data collection across different object configurations. In this work, we present DemoGen, a low-cost, fully synthetic approach for automatic demonstration generation. Using only one human-collected demonstration per task, DemoGen generates spatially augmented demonstrations by adapting the demonstrated action trajectory to novel object configurations. Visual observations are synthesized by leveraging 3D point clouds as the modality and rearranging the subjects in the scene via 3D editing. Empirically, DemoGen significantly enhances policy performance across a diverse range of real-world manipulation tasks, showing its applicability even in challenging scenarios involving deformable objects, dexterous hand end-effectors, and bimanual platforms. Furthermore, DemoGen can be extended to enable additional out-of-distribution capabilities, including disturbance resistance and obstacle avoidance.
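To make the core idea concrete, the following is a minimal sketch of spatial augmentation under strong simplifying assumptions, not the authors' implementation. It assumes a single already-segmented object, a pure translation to the new configuration, and an end-effector path that moves rigidly with the object. Points belonging to the robot itself are ignored for brevity. The names `shift`, `synthesize_demo`, `is_object`, and `ee_path` are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def shift(p: Point, offset: Point) -> Point:
    """Translate one 3D point by the given offset."""
    return (p[0] + offset[0], p[1] + offset[1], p[2] + offset[2])


def synthesize_demo(
    cloud: Sequence[Point],
    is_object: Sequence[bool],
    ee_path: Sequence[Point],
    offset: Point,
) -> Tuple[List[Point], List[Point]]:
    """Build one synthetic (observation, action) pair from a source demo.

    Assumption: the object is moved by a pure translation, and the whole
    end-effector path is moved by the same translation.
    """
    # Observation: move only the points flagged as the object;
    # background points stay where they were.
    new_cloud = [shift(p, offset) if flag else p
                 for p, flag in zip(cloud, is_object)]
    # Action: move every end-effector waypoint along with the object.
    new_path = [shift(p, offset) for p in ee_path]
    return new_cloud, new_path


# Toy source demonstration: one background point and two object points.
cloud = [(0.0, 0.0, 0.0), (0.5, 0.25, 0.0), (0.5, 0.25, 0.125)]
is_object = [False, True, True]
ee_path = [(0.5, 0.25, 0.5), (0.5, 0.25, 0.125)]

# Hypothetical new object placement, expressed as an offset in metres.
offset = (0.25, -0.25, 0.0)
new_cloud, new_path = synthesize_demo(cloud, is_object, ee_path, offset)
```

Sampling many offsets and calling `synthesize_demo` once per offset would yield a set of spatially varied demonstrations from the single source demonstration. A fuller treatment would also handle rotations, multiple objects, and replanning of free-space motion, none of which this sketch attempts.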
