Learning Novel Skills from Language-Generated Demonstrations

Ao-Qun Jin; Tian-Yu Xiang; Xiao-Hu Zhou; Mei-Jiang Gui; Xiao-Liang Xie; Shi-Qi Liu; Shuang-Yi Wang; Yue Cao; Sheng-Bin Duan; Fu-Chao Xie; Zeng-Guang Hou

TL;DR

DemoGen tackles the data and safety bottlenecks of learning novel robotic skills by turning natural language task descriptions into demonstration videos with a vision-language model and a diffusion-based video generator. It then extracts state-action pairs from the generated videos with an inverse dynamics model and learns policies through imitation learning, enabling both zero-shot and few-shot skill acquisition in simulation. In MetaWorld, the generated demonstrations are high-fidelity, and on novel tasks they roughly triple the accomplishment rate of the skill-learning algorithms trained on them. The framework reduces data collection costs and safety risks while providing a scalable, language-driven path toward broad robotic skill acquisition, with potential for real-world validation.
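The last two stages of this pipeline (labeling demonstration frames with actions via inverse dynamics, then imitating the labeled pairs) can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration and is not DemoGen's implementation: the "video" is replaced by a 1-D state trajectory, the dynamics are assumed to be s' = s + a, the policy is a single proportional gain, and all function names are hypothetical.

```python
from typing import List, Tuple


def generate_demo_states(start: float, goal: float, steps: int) -> List[float]:
    """Hypothetical stand-in for a generated demonstration video:
    a 1-D state sequence that moves halfway to the goal at each step."""
    states = [start]
    for _ in range(steps):
        s = states[-1]
        states.append(s + 0.5 * (goal - s))
    return states


def inverse_dynamics(states: List[float]) -> List[Tuple[float, float]]:
    """Toy inverse dynamics: under the assumed dynamics s' = s + a,
    the action that explains a transition is simply a = s' - s."""
    return [(s, s_next - s) for s, s_next in zip(states, states[1:])]


def fit_policy(pairs: List[Tuple[float, float]], goal: float) -> float:
    """Toy behavior cloning: least-squares fit of the gain k in
    a = k * (goal - s) to the labeled state-action pairs."""
    num = sum((goal - s) * a for s, a in pairs)
    den = sum((goal - s) ** 2 for s, _ in pairs)
    if den == 0.0:
        raise ValueError("demonstration never leaves the goal state")
    return num / den


def rollout(k: float, start: float, goal: float, steps: int) -> float:
    """Run the cloned policy under the same assumed dynamics."""
    s = start
    for _ in range(steps):
        s = s + k * (goal - s)
    return s


if __name__ == "__main__":
    demo = generate_demo_states(start=0.0, goal=1.0, steps=5)
    pairs = inverse_dynamics(demo)
    k = fit_policy(pairs, goal=1.0)
    print("fitted gain:", k)
    print("final state after rollout:", rollout(k, 0.0, 1.0, 10))
```

In the framework described above, the state sequence would come from generated video frames and both the inverse dynamics model and the policy would be learned networks; the sketch only shows how the action labels connect the demonstration to the imitation step.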

Abstract

Robots are increasingly deployed across diverse domains to tackle tasks requiring novel skills. However, current robot learning algorithms for acquiring novel skills often rely on demonstration datasets or environment interactions, resulting in high labor costs and potential safety risks. To address these challenges, this study proposes DemoGen, a skill-learning framework that enables robots to acquire novel skills from natural language instructions. DemoGen leverages a vision-language model and a video diffusion model to generate demonstration videos of novel skills, enabling robots to learn new skills effectively. Experimental evaluations in the MetaWorld simulation environments demonstrate the pipeline's capability to generate high-fidelity and reliable demonstrations. Using the generated demonstrations, various skill-learning algorithms triple their original accomplishment rate on novel tasks. These results highlight a novel approach to robot learning, offering a foundation for the intuitive and intelligent acquisition of novel robotic skills. (Project website: https://aoqunjin.github.io/LNSLGD/)
