HumanMimic: Learning Natural Locomotion and Transitions for Humanoid Robot via Wasserstein Adversarial Imitation
Annan Tang, Takuma Hiraoka, Naoki Hiraoka, Fan Shi, Kento Kawaharazuka, Kunio Kojima, Kei Okada, Masayuki Inaba
TL;DR
The paper introduces a Wasserstein adversarial imitation framework for humanoid locomotion, using a soft boundary constraint on the critic to stabilize training. It couples a unified primitive-skeleton motion retargeting pipeline with velocity-conditioned RL, enabling the full-sized humanoid JAXON to imitate diverse human locomotion and to transition seamlessly between gaits as velocity commands change. Key contributions include the soft-boundary Wasserstein-1 critic, a practical motion retargeting method, demonstrations of natural gait patterns and transitions in high-fidelity simulation, and sim-to-sim transfer as a step toward real-world deployment. By addressing mode collapse and cross-morphology transfer, the work could reduce reward engineering and improve the robustness of humanoid locomotion in real-world scenarios.
Abstract
Transferring human motion skills to humanoid robots remains a significant challenge. In this study, we introduce a Wasserstein adversarial imitation learning system that allows humanoid robots to replicate natural whole-body locomotion patterns and execute seamless transitions by mimicking human motions. First, we present a unified primitive-skeleton motion retargeting method to mitigate morphological differences between arbitrary human demonstrators and humanoid robots. An adversarial critic component is integrated with Reinforcement Learning (RL) to guide the control policy to produce behaviors aligned with the data distribution of mixed reference motions. Additionally, we employ a specific Integral Probability Metric (IPM), namely the Wasserstein-1 distance, with a novel soft boundary constraint to stabilize the training process and prevent mode collapse. Our system is evaluated on a full-sized humanoid JAXON in simulation. The resulting control policy demonstrates a wide range of locomotion patterns, including standing, push-recovery, squat walking, human-like straight-leg walking, and dynamic running. Notably, even in the absence of transition motions in the demonstration dataset, robots showcase an emerging ability to transition naturally between distinct locomotion patterns as the desired speed changes.
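
The abstract does not spell out the form of the soft boundary constraint. As a minimal sketch, assume the critic's raw scores are squashed through a scaled tanh before the Wasserstein-1 difference of means is taken. The tanh form, the function names, and the scale parameter `eta` are illustrative assumptions, not details confirmed by the text above. Under that assumption, the critic objective might look like this:

```python
import numpy as np


def plain_w1_critic_loss(scores_ref, scores_policy):
    """Unbounded Wasserstein-1 critic loss (to be minimized).

    The critic tries to score reference motions high and policy motions low,
    so the loss is E_policy[D] - E_ref[D]. Nothing limits its magnitude.
    """
    ref = np.asarray(scores_ref, dtype=float)
    pol = np.asarray(scores_policy, dtype=float)
    return float(pol.mean() - ref.mean())


def soft_bounded_w1_critic_loss(scores_ref, scores_policy, eta=0.5):
    """Hypothetical soft-bounded variant of the loss above.

    ASSUMPTION: each raw score is passed through tanh(eta * score), which
    keeps every term in (-1, 1) and therefore the loss in [-2, 2].
    """
    ref = np.tanh(eta * np.asarray(scores_ref, dtype=float))
    pol = np.tanh(eta * np.asarray(scores_policy, dtype=float))
    return float(pol.mean() - ref.mean())


# Illustrative raw critic scores for a batch of reference and policy samples.
ref_scores = np.array([1e6, 2e6, 3e6])
policy_scores = np.array([-1e6, -2e6, -3e6])

unbounded = plain_w1_critic_loss(ref_scores, policy_scores)
bounded = soft_bounded_w1_critic_loss(ref_scores, policy_scores)
```

With the unbounded loss, the critic can keep lowering the objective simply by inflating its output scale. The squashed version saturates instead, which is one plausible reading of how a soft boundary could stabilize adversarial training.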
