SkillMimicGen: Automated Demonstration Generation for Efficient Skill Learning and Deployment

Caelan Garrett, Ajay Mandlekar, Bowen Wen, Dieter Fox

TL;DR

SkillGen addresses the data bottleneck in imitation learning for manipulation by automatically generating large demonstration datasets from a few human examples through skill segmentation and motion-planned transitions. It introduces Hybrid Skill Policies that initiate, control, and terminate local skills, enabling seamless sequencing with a motion planner at test time. The approach yields substantially higher data-generation throughput and policy performance than prior methods, demonstrates cross-robot transfer and real-world applicability, and even achieves zero-shot sim-to-real transfer on long-horizon tasks. This work significantly reduces human effort while maintaining high task proficiency, advancing scalable, robust skill learning for robotic manipulation.

Abstract

Imitation learning from human demonstrations is an effective paradigm for robot manipulation, but acquiring large datasets is costly and resource-intensive, especially for long-horizon tasks. To address this issue, we propose SkillMimicGen (SkillGen), an automated system for generating demonstration datasets from a few human demos. SkillGen segments human demos into manipulation skills, adapts these skills to new contexts, and stitches them together through free-space transit and transfer motion. We also propose a Hybrid Skill Policy (HSP) framework for learning skill initiation, control, and termination components from SkillGen datasets, enabling skills to be sequenced using motion planning at test time. We demonstrate that SkillGen greatly improves data generation and policy learning performance over a state-of-the-art data generation framework, resulting in the capability to produce data for large scene variations, including clutter, and agents that are on average 24% more successful. We demonstrate the efficacy of SkillGen by generating over 24K demonstrations across 18 task variants in simulation from just 60 human demonstrations, and training proficient, often near-perfect, HSP agents. Finally, we apply SkillGen to 3 real-world manipulation tasks and also demonstrate zero-shot sim-to-real transfer on a long-horizon assembly task. Videos and more are available at https://skillgen.github.io.

Paper Structure

This paper contains 52 sections, 12 figures, 14 tables, 2 algorithms.

Figures (12)

  • Figure 1: SkillGen Overview. SkillGen trains proficient agents with minimal human effort. (left) First, a human teleoperator collects $\sim 3$ demonstrations of the task and annotates the start and end of the skill segments, where each object interaction happens. (middle) Then, SkillGen automatically adapts these local skill demonstrations to new scenes and connects them through motion planning to amplify the number of successful demonstrations. (right) These demonstrations are used to train Hybrid Skill Policies (HSP), agents that alternate between closed-loop reactive skills and coarse transit motions carried out by motion planning.
  • Figure 2: HSP Deployment. At test time, SkillGen executes several learned skills in sequence, using motion planning to connect the termination state of the last skill with an initiation state of the next skill. Each skill consists of the initiation condition $I_{\theta}$, the closed-loop controller $\pi_{\theta}$, and the termination condition $T_{\theta}$.
  • Figure 3: Tasks. We deploy SkillGen on 6 simulation tasks (a-f; 18 task variants, see the appendix) and 4 real-world tasks (g-j). These tasks involve fine-grained insertion (a-d), composing several manipulation behaviors together (e, f), real-world data generation and training (g-i), and zero-shot sim-to-real policy transfer (j).
  • Figure 4: (left) Agent Performance on SkillGen Datasets. Success rates of agents trained on source demonstrations (with HSP-TAMP), MimicGen [mandlekar2023mimicgen] data (with BC-RNN [robomimic2021]), and SkillGen data (with all HSP variants). SkillGen data greatly improves agent performance on $D_0$ compared to the source data, and SkillGen agents substantially outperform MimicGen agents, especially on more challenging task variants. (upper right) Training Data Comparison. HSP-TAMP agent performance is comparable on 200 SkillGen demos and 200 human demos, even though SkillGen uses just 10 human demos for generation. Generating more SkillGen demonstrations can yield significant performance improvements (see also the appendix). (lower right) Real-World Manipulation Results. HSP-Class agents trained on SkillGen data generated in the real world are proficient and substantially outperform agents trained on MimicGen data. They can also be transferred zero-shot from sim to real.
  • Figure D.1: Example Configurations for Clutter Tasks. Example configurations from the clutter task variants of Square and Coffee.
  • ...and 7 more figures
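
The deployment loop described in the Figure 2 caption can be sketched roughly as follows. This is a minimal, hypothetical illustration and not the authors' implementation: the `Skill` container, the `plan_motion` stub (straight-line interpolation standing in for a real motion planner), the `make_reach_skill` helper, and the toy 1-D state are all assumptions introduced here. The initiation condition $I_{\theta}$ is modeled as a function returning a state to start the skill from, $\pi_{\theta}$ as a closed-loop step, and $T_{\theta}$ as a boolean test.

```python
from dataclasses import dataclass
from typing import Callable, List

State = float  # toy 1-D end-effector position (assumption for illustration)


@dataclass
class Skill:
    """Hypothetical container for the three components of one skill."""
    initiation: Callable[[State], State]  # I_theta: state to start the skill from
    controller: Callable[[State], float]  # pi_theta: closed-loop action (a displacement)
    termination: Callable[[State], bool]  # T_theta: True once the skill is finished


def plan_motion(start: State, goal: State, steps: int = 5) -> List[State]:
    """Stand-in for a motion planner: straight-line interpolation."""
    return [start + (goal - start) * i / steps for i in range(1, steps + 1)]


def run_hsp(state: State, skills: List[Skill], max_steps: int = 100) -> State:
    """Alternate between planned transit motions and closed-loop skills."""
    for skill in skills:
        # Transit: connect the current (termination) state to the
        # initiation state of the next skill.
        goal = skill.initiation(state)
        for waypoint in plan_motion(state, goal):
            state = waypoint
        # Skill: run the closed-loop controller until termination fires.
        for _ in range(max_steps):
            if skill.termination(state):
                break
            state = state + skill.controller(state)
    return state


def make_reach_skill(start: State, target: State) -> Skill:
    """Toy skill: begin at `start`, servo proportionally toward `target`."""
    return Skill(
        initiation=lambda s: start,
        controller=lambda s: 0.5 * (target - s),
        termination=lambda s: abs(target - s) < 1e-3,
    )


skills = [make_reach_skill(1.0, 2.0), make_reach_skill(4.0, 5.0)]
final_state = run_hsp(0.0, skills)
```

In this toy setup the planner handles the coarse moves (0 to 1, then roughly 2 to 4) and the controllers handle the fine approach to each target; in a real system the three components would be learned from generated demonstrations and the planner would account for collisions.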