Generalized Animal Imitator: Agile Locomotion with Versatile Motion Prior

Ruihan Yang; Zhuoqun Chen; Jianhan Ma; Chongyi Zheng; Yiyu Chen; Quan Nguyen; Xiaolong Wang

TL;DR

This work introduces the Versatile Instructable Motion prior (VIM), a reinforcement learning framework that learns a single, reusable motion prior from a diverse set of reference motions (mocap, synthetic, and trajectory-optimized) to enable multiple agile locomotion skills in legged robots. A latent command space, learned via a reference motion encoder and autoregressive KL regularization, guides a low-level policy trained with a combination of Functionality and Style rewards, including an adversarial stylization component. The approach yields spatial-temporal skill representations that support smooth transitions and enable high-level policies to solve downstream tasks such as following commands and jumping, demonstrated in both simulation and on real hardware with better performance and sample efficiency than baselines. VIM achieves realistic, agile behaviors such as backflips and jumps without fine-tuning, illustrating strong sim2real transfer and potential for broad robotic applications. The work also discusses limitations and avenues for safety, perception, and dynamics-aware extensions to further enhance robustness and applicability.
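To make the "autoregressive KL regularization" idea more concrete, here is a minimal sketch. It assumes the encoder emits a diagonal Gaussian (mean and standard deviation) per time step and that each latent is regularized toward the previous one; this summary does not state the exact formulation, so the function names and the choice of prior are illustrative assumptions rather than the authors' implementation.

```python
import math

def kl_diag_gaussians(mu_q, std_q, mu_p, std_p):
    """Closed-form KL(q || p) between two diagonal Gaussians.

    Each argument is a list of per-dimension means or standard deviations.
    """
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, std_q, mu_p, std_p):
        total += math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2.0 * sp ** 2) - 0.5
    return total

def autoregressive_kl(latents):
    """Sum KL(z_t || z_{t-1}) over a sequence of (mean, std) pairs.

    Assumption: the previous step's latent distribution acts as the prior
    for the current step, which penalizes abrupt jumps in latent space.
    """
    return sum(
        kl_diag_gaussians(mu_t, std_t, mu_prev, std_prev)
        for (mu_prev, std_prev), (mu_t, std_t) in zip(latents, latents[1:])
    )

# Hypothetical two-dimensional latents for three consecutive motion segments.
sequence = [([0.0, 0.0], [1.0, 1.0]), ([0.1, 0.0], [1.0, 1.0]), ([0.2, 0.1], [0.9, 1.0])]
penalty = autoregressive_kl(sequence)
```

Under this assumption, a penalty of this kind would encourage temporally smooth latent commands, which is consistent with the smooth skill transitions described above.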

Abstract

The agility of animals, particularly in complex activities such as running, turning, jumping, and backflipping, stands as an exemplar for robotic system design. Transferring this suite of behaviors to legged robotic systems raises essential questions: How can a robot learn multiple locomotion behaviors simultaneously? How can the robot execute these tasks with smooth transitions? How can these skills be integrated for wide-ranging applications? This paper introduces the Versatile Instructable Motion prior (VIM), a Reinforcement Learning framework designed to incorporate a range of agile locomotion tasks suitable for advanced robotic applications. Our framework enables legged robots to learn diverse agile low-level skills by imitating animal motions and manually designed motions. Our Functionality reward guides the robot's ability to adopt varied skills, and our Stylization reward ensures that robot motions align with reference motions. Our evaluations of the VIM framework span both simulation and the real world. Our framework allows a robot to concurrently learn diverse agile locomotion skills using a single learning-based controller in the real world. Videos can be found on our website: https://rchalyang.github.io/VIM/
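As a rough illustration of how a Functionality reward and a Stylization reward might be combined, consider the sketch below. The exact reward terms and weights are not given in this summary, so everything here is an assumption: an exponential pose-tracking term stands in for the Functionality reward, a GAIL-style discriminator term stands in for the adversarial Stylization reward, and the names `functionality_reward`, `style_reward`, `total_reward`, `scale`, `w_func`, and `w_style` are hypothetical.

```python
import math

def functionality_reward(robot_pose, ref_pose, scale=2.0):
    """Assumed tracking term: 1.0 for a perfect match, decaying with squared error."""
    err = sum((r - q) ** 2 for r, q in zip(robot_pose, ref_pose))
    return math.exp(-scale * err)

def style_reward(disc_logit):
    """Assumed adversarial term in the GAIL style: -log(1 - D(s)).

    disc_logit is the raw output of a hypothetical discriminator that scores
    how closely the robot's motion resembles the reference data.
    """
    logit = max(-30.0, min(30.0, disc_logit))   # clamp to avoid overflow in exp
    d = 1.0 / (1.0 + math.exp(-logit))          # sigmoid -> probability of "reference-like"
    return -math.log(max(1.0 - d, 1e-8))        # floor avoids log(0)

def total_reward(robot_pose, ref_pose, disc_logit, w_func=0.5, w_style=0.5):
    """Weighted sum of the two terms; the weights are illustrative only."""
    return (w_func * functionality_reward(robot_pose, ref_pose)
            + w_style * style_reward(disc_logit))

# Hypothetical joint angles (radians) for the robot and the reference frame.
r = total_reward([0.10, -0.40, 0.80], [0.12, -0.38, 0.79], disc_logit=0.3)
```

The split mirrors the roles described in the abstract: the tracking term rewards reaching the reference motion's targets, while the discriminator term rewards motion that resembles the reference data in style.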

Paper Structure (24 sections, 5 equations, 11 figures, 6 tables)

Introduction
Related Work
Learn Versatile Instructable Motion Prior (VIM)
Motion Prior Structure
Imitation Reward for Functionality and Style
Solving Downstream Tasks with Motion Prior
Experiments
Evaluation of Learned Low-level Motion Priors
Evaluation on High-level Tasks
Limitations
Conclusion
Reference Motion Dataset
Performance Across Different Reference Motions
Additional Discussion about ASE
Additional Discussion over Skill Learning Frameworks
...and 9 more sections

Figures (11)

Figure 1: Our system learns a single instructable motion prior from a diverse reference motion dataset.
Figure 2: VIM and Reward: Our reference motion encoder maps reference motions into a latent skill space, and the low-level policy outputs motor commands. $V_{low}$ is the low-level critic for RL training. Our reward encourages the robot to track the root trajectory and the joint motion of the reference motion.
Figure 3: Solving High-level Tasks with Our Motion Prior. Our high-level policy outputs a latent command for the low-level policy.
Figure 4: Real World Backflip Trajectory: Each row represents a single trajectory (from top to bottom: Reference Motion, VIM, GAIL, Motion Imitation). Trajectories are shown from left to right.
Figure 5: Latent Skill Space t-SNE. We visualize the latent embedding for varying motion segments.
...and 6 more figures
