HOVER: Versatile Neural Whole-Body Controller for Humanoid Robots
Tairan He, Wenli Xiao, Toru Lin, Zhengyi Luo, Zhenjia Xu, Zhenyu Jiang, Jan Kautz, Changliu Liu, Guanya Shi, Xiaolong Wang, Linxi Fan, Yuke Zhu
TL;DR
Humanoid whole-body control has been held back by mode-specific controllers, each tied to its own command space, which hinders transfer between navigation, loco-manipulation, and tabletop manipulation. The authors propose HOVER, a unified neural controller that supports many control modes by distilling an oracle motion imitator trained on large-scale MoCap data retargeted to a humanoid. Using a unified command space and a mask-based distillation pipeline, HOVER shares core motor skills across modes and enables seamless transitions between them without retraining. Results in simulation and on a real Unitree H1 show that HOVER outperforms specialist baselines and a multi-mode RL baseline on a range of metrics, and that it operates robustly and switches modes on real hardware.
Abstract
Humanoid whole-body control requires adapting to diverse tasks such as navigation, loco-manipulation, and tabletop manipulation, each demanding a different mode of control. For example, navigation relies on root velocity tracking, while tabletop manipulation prioritizes upper-body joint angle tracking. Existing approaches typically train individual policies tailored to a specific command space, limiting their transferability across modes. We present the key insight that full-body kinematic motion imitation can serve as a common abstraction for all these tasks and provide general-purpose motor skills for learning multiple modes of whole-body control. Building on this, we propose HOVER (Humanoid Versatile Controller), a multi-mode policy distillation framework that consolidates diverse control modes into a unified policy. HOVER enables seamless transitions between control modes while preserving the distinct advantages of each, offering a robust and scalable solution for humanoid control across a wide range of modes. By eliminating the need for policy retraining for each control mode, our approach improves efficiency and flexibility for future humanoid applications.
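As a rough illustration of how a single command space can serve several control modes (for example, root velocity tracking for navigation versus upper-body joint angle tracking for tabletop manipulation), the sketch below masks a unified command vector per mode. It is not the authors' implementation: the field names, field sizes, mode names, and the helpers `build_mask` and `apply_mask` are all hypothetical, chosen only to make the idea concrete.

```python
# Illustrative sketch only: every name and size here is an assumption,
# not taken from the HOVER paper or its code.

# Hypothetical layout of a unified command vector: field name -> number of entries.
COMMAND_LAYOUT = {
    "root_velocity": 3,
    "upper_body_joint_angles": 4,
    "lower_body_joint_angles": 4,
}

# Hypothetical control modes, each defined by the command fields it tracks.
MODES = {
    "navigation": {"root_velocity"},
    "tabletop_manipulation": {"upper_body_joint_angles"},
    "full_body_imitation": set(COMMAND_LAYOUT),
}


def build_mask(mode):
    """Return a 0/1 mask over the unified command vector for the given mode."""
    active = MODES[mode]
    mask = []
    for name, size in COMMAND_LAYOUT.items():
        mask.extend([1.0 if name in active else 0.0] * size)
    return mask


def apply_mask(command, mask):
    """Zero out the command entries that the current mode does not track."""
    if len(command) != len(mask):
        raise ValueError("command and mask must have the same length")
    return [c * m for c, m in zip(command, mask)]


# Usage: the same policy input shape is kept for every mode; only the mask changes.
command = [0.5] * sum(COMMAND_LAYOUT.values())
nav_command = apply_mask(command, build_mask("navigation"))
tabletop_command = apply_mask(command, build_mask("tabletop_manipulation"))
```

Because the input shape stays fixed, switching modes in this sketch amounts to swapping the mask rather than swapping the policy, which is the property the abstract attributes to a unified controller.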
