Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning

Jiyuan Shi; Xinzhe Liu; Dewei Wang; Ouyang Lu; Sören Schwertfeger; Chi Zhang; Fuchun Sun; Chenjia Bai; Xuelong Li

TL;DR

ALMI addresses the challenge of human-like whole-body coordination by decoupling learning for locomotion and motion imitation through an adversarial training framework. It introduces a dual-curriculum, two-player Markov-game approach and the ALMI-X dataset to support language-guided, end-to-end humanoid control and foundation-model research. Empirical results in simulation and on the Unitree H1-2 demonstrate improved robustness and tracking accuracy over baselines, and the ALMI-X dataset enables preliminary foundation-model exploration. The work also highlights promising avenues for end-to-end humanoid control while acknowledging limitations in highly dynamic tasks and sim-to-real data efficiency, suggesting future improvements in unified rewards and data-efficient foundation models.

Abstract

Humans exhibit diverse and expressive whole-body movements. However, attaining human-like whole-body coordination in humanoid robots remains challenging, as conventional approaches that mimic whole-body motions often neglect the distinct roles of the upper and lower body. This oversight leads to computationally intensive policy learning and frequently causes robot instability and falls during real-world execution. To address these issues, we propose Adversarial Locomotion and Motion Imitation (ALMI), a novel framework that enables adversarial policy learning between the upper and lower body. Specifically, the lower-body policy learns robust locomotion that follows velocity commands while the upper body tracks various motions. Conversely, the upper-body policy learns to track motions effectively while the robot executes velocity-based movements. Through iterative updates, these policies achieve coordinated whole-body control, which can be extended to loco-manipulation tasks with teleoperation systems. Extensive experiments demonstrate that our method achieves robust locomotion and precise motion tracking both in simulation and on the full-size Unitree H1 robot. Additionally, we release a large-scale whole-body motion control dataset featuring high-quality episodic trajectories from MuJoCo simulations that are deployable on real robots. The project page is https://almi-humanoid.github.io.
