Continuous Control of Diverse Skills in Quadruped Robots Without Complete Expert Datasets
Jiaxin Tu, Xiaoyi Wei, Yueqi Zhang, Taixian Hou, Xiaofei Gao, Zhiyan Dong, Peng Zhai, Lihua Zhang
TL;DR
The paper tackles the challenge of learning diverse quadruped skills and smooth transitions between them without relying on complete expert datasets. It introduces PASIST, a framework that combines introspective learning with Generative Adversarial Self-Imitation Learning to autonomously discover high-quality trajectories, guided by target poses and a DTW-based trajectory quality metric. A skill selector mitigates mode collapse and balances learning across skills of differing difficulty, enabling smooth transitions between behaviors. Experiments in simulation and on a real Solo 8 robot demonstrate effective multi-skill acquisition and zero-shot sim-to-real transfer, offering an efficient alternative to expert-driven imitation learning.
Abstract
Learning diverse skills for quadruped robots presents significant challenges, such as mastering complex transitions between different skills and handling tasks of varying difficulty. Existing imitation learning methods, while successful, rely on expensive datasets to reproduce expert behaviors. Inspired by introspective learning, we propose Progressive Adversarial Self-Imitation Skill Transition (PASIST), a novel method that eliminates the need for complete expert datasets. PASIST autonomously explores and selects high-quality trajectories based on predefined target poses instead of demonstrations, leveraging the Generative Adversarial Self-Imitation Learning (GASIL) framework. To further enhance learning, we develop a skill selection module that mitigates mode collapse by balancing the weights of skills with varying levels of difficulty. Through these methods, PASIST is able to reproduce the skills corresponding to the target poses while achieving smooth and natural transitions between them. Evaluations both in simulation and on the Solo 8 robot confirm the effectiveness of PASIST, offering an efficient alternative to expert-driven learning.
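
To make the idea of selecting trajectories against target poses more concrete, the following is a minimal sketch of a DTW-based quality score of the kind the TL;DR mentions. It is not the authors' implementation: the function names `dtw_distance` and `select_best`, the use of Euclidean distance between poses, and the top-k selection rule are all assumptions made for illustration.

```python
import math


def dtw_distance(traj, target):
    """Classic O(n*m) dynamic time warping cost between two pose sequences.

    Each pose is a tuple of floats; the per-step cost is the Euclidean
    distance (an assumption, the paper may use a different pose metric).
    """
    n, m = len(traj), len(target)
    inf = float("inf")
    # cost[i][j] = best alignment cost of traj[:i] with target[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(traj[i - 1], target[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # traj advances, target pose repeats
                cost[i][j - 1],      # target advances, traj pose repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def select_best(trajectories, target, k):
    """Hypothetical selection rule: keep the k rollouts closest to the target."""
    return sorted(trajectories, key=lambda t: dtw_distance(t, target))[:k]


# Toy 1-D "poses": the target sequence and two candidate rollouts.
target = [(0.0,), (1.0,), (2.0,)]
slow_but_correct = [(0.0,), (0.0,), (1.0,), (2.0,)]  # same path, different timing
overshoots = [(0.0,), (1.0,), (3.0,)]                # wrong final pose

best = select_best([overshoots, slow_but_correct], target, k=1)
```

Because DTW aligns sequences in time before comparing them, the rollout that reaches the right poses at a different pace scores better than one that matches the timing but misses the final pose. In a self-imitation setting, rollouts selected this way could then stand in for the expert buffer that a discriminator is trained against.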
