H-Zero: Cross-Humanoid Locomotion Pretraining Enables Few-shot Novel Embodiment Transfer
Yunfeng Lin, Minghuan Liu, Yufei Xue, Ming Zhou, Yong Yu, Jiangmiao Pang, Weinan Zhang
TL;DR
H-Zero is a cross-embodiment locomotion pretraining framework that learns a unified base policy for humanoids by standardizing control semantics, diversifying the morphologies seen during training, and applying embodiment-aware learning. The pretrained policy transfers zero-shot to unseen robots and adapts few-shot within about 30 minutes of fine-tuning, with robust sim-to-real transfer. Key contributions include a hardware-agnostic joint representation, embodiment descriptors, and dynamic per-embodiment training strategies that balance exploration and learning progress. The approach substantially improves transferability over single-embodiment training and offers a scalable path toward general humanoid locomotion across diverse platforms.
Abstract
The rapid advancement of humanoid robotics has intensified the need for robust and adaptable controllers that enable stable and efficient locomotion across diverse platforms. However, developing such controllers remains a significant challenge because existing solutions are tailored to specific robot designs and require extensive tuning of reward functions, physical parameters, and training hyperparameters for each embodiment. To address this challenge, we introduce H-Zero, a cross-humanoid locomotion pretraining pipeline that learns a generalizable humanoid base policy. We show that pretraining on a limited set of embodiments enables zero-shot transfer to novel humanoid robots, as well as few-shot transfer with minimal fine-tuning. In simulation, the pretrained policy maintains up to 81% of the full episode duration on unseen robots without fine-tuning, and it transfers few-shot to unseen humanoids and upright quadrupeds within 30 minutes of fine-tuning.
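To make the hardware-agnostic joint representation mentioned in the TL;DR more concrete, the following is a minimal sketch of one way such a mapping could look. It is not the authors' implementation: the canonical joint list, the function names `to_canonical` and `from_canonical`, and the zero-padding-plus-mask scheme are all assumptions chosen for illustration. The idea shown is that each robot's joints are scattered into a fixed set of shared slots, so a single policy sees the same input and output layout regardless of how many joints a given robot has.

```python
# Hypothetical canonical joint slots shared by every embodiment (an assumption,
# not the joint set used in the paper).
CANONICAL_JOINTS = [
    "left_hip_pitch", "left_hip_roll", "left_hip_yaw",
    "left_knee", "left_ankle_pitch", "left_ankle_roll",
    "right_hip_pitch", "right_hip_roll", "right_hip_yaw",
    "right_knee", "right_ankle_pitch", "right_ankle_roll",
]


def to_canonical(joint_values: dict[str, float]) -> tuple[list[float], list[bool]]:
    """Scatter robot-specific joint readings into the canonical slots.

    Returns (values, mask): slots the robot lacks are zero-padded and
    flagged False in the mask so a policy could ignore them.
    """
    values = [0.0] * len(CANONICAL_JOINTS)
    mask = [False] * len(CANONICAL_JOINTS)
    for i, name in enumerate(CANONICAL_JOINTS):
        if name in joint_values:
            values[i] = float(joint_values[name])
            mask[i] = True
    return values, mask


def from_canonical(action: list[float], robot_joints: list[str]) -> dict[str, float]:
    """Gather canonical action entries for the joints this robot actually has."""
    index = {name: i for i, name in enumerate(CANONICAL_JOINTS)}
    return {name: action[index[name]] for name in robot_joints if name in index}


# Example: a hypothetical robot without ankle-roll or hip-yaw joints.
obs = {"left_hip_pitch": 0.1, "left_knee": 0.5, "right_knee": -0.25}
values, mask = to_canonical(obs)
targets = from_canonical(values, list(obs))
```

In this sketch, the mask plays the role of a very simple embodiment descriptor: it tells the policy which slots are populated for the current robot. A real descriptor would likely carry more information, such as link lengths or masses, which the text above does not specify.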
