Advancing Humanoid Locomotion: Mastering Challenging Terrains with Denoising World Model Learning
Xinyang Gu, Yen-Jen Wang, Xiang Zhu, Chengming Shi, Yanjiang Guo, Yichen Liu, Jianyu Chen
TL;DR
This work presents Denoising World Model Learning (DWL), an end-to-end reinforcement learning framework for humanoid locomotion that addresses the sim-to-real gap with an encoder–decoder world model and domain randomization. DWL enables a single learned policy to master real-world terrains (snow, stairs, and irregular surfaces) through zero-shot sim-to-real transfer and active 2-DoF ankle control with a Closed Kinematic Chain mechanism. The approach combines a denoising loss, PPO-based policy optimization, and privileged information during training, and it achieves a robust gait across indoor and outdoor environments and under substantial disturbances. The findings demonstrate significant improvements in terrain adaptation, state estimation, and ankle-assisted stability, with practical implications for deploying humanoid robots in human-centric settings.
Abstract
Humanoid robots, with their human-like skeletal structure, are especially suited for tasks in human-centric environments. However, this structure is accompanied by additional challenges in locomotion controller design, especially in complex real-world environments. As a result, existing humanoid robots are limited to relatively simple terrains, either with model-based control or model-free reinforcement learning. In this work, we introduce Denoising World Model Learning (DWL), an end-to-end reinforcement learning framework for humanoid locomotion control, which demonstrates the world's first humanoid robot to master real-world challenging terrains such as snowy and inclined land in the wild, up and down stairs, and extremely uneven terrains. All scenarios run the same learned neural network with zero-shot sim-to-real transfer, indicating the superior robustness and generalization capability of the proposed method.
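To make the denoising idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a linear encoder and decoder, a small hypothetical state vector, and Gaussian sensor noise; none of these details are specified in this summary. The encoder sees only a noisy observation, the decoder tries to recover the true (privileged) state available in simulation, and the denoising loss is the mean squared error between the two.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def corrupt(true_state, noise_std, rng):
    """Simulate a noisy sensor reading of the true state (assumed Gaussian noise)."""
    return [s + rng.gauss(0.0, noise_std) for s in true_state]

def denoising_loss(W_enc, W_dec, noisy_obs, true_state):
    """Mean squared error between the decoded estimate and the true state."""
    latent = matvec(W_enc, noisy_obs)   # encoder: noisy observation -> latent
    estimate = matvec(W_dec, latent)    # decoder: latent -> state estimate
    n = len(true_state)
    return sum((e - t) ** 2 for e, t in zip(estimate, true_state)) / n

# Hypothetical sizes and randomly initialized weights, for illustration only.
rng = random.Random(0)
state_dim, latent_dim = 4, 2
W_enc = [[rng.uniform(-0.5, 0.5) for _ in range(state_dim)] for _ in range(latent_dim)]
W_dec = [[rng.uniform(-0.5, 0.5) for _ in range(latent_dim)] for _ in range(state_dim)]

true_state = [0.1, -0.3, 0.7, 0.0]      # privileged state, known only in simulation
noisy_obs = corrupt(true_state, noise_std=0.05, rng=rng)
loss = denoising_loss(W_enc, W_dec, noisy_obs, true_state)
print(f"denoising loss: {loss:.4f}")
```

In a full training loop this loss would be minimized jointly with the PPO objective, so that the latent representation used by the policy stays informative about the true state even when the real robot's sensors are noisy. The sketch computes the loss only and performs no gradient update.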
