EGM: Efficiently Learning General Motion Tracking Policy for High Dynamic Humanoid Whole-Body Control

Chao Yang; Yingkai Sun; Peng Ye; Xin Chen; Chong Yu; Tao Chen

TL;DR

EGM tackles the challenge of learning a general humanoid motion-tracking policy from limited high-quality data. It introduces four core designs: Bin-based Cross-motion Curriculum Adaptive Sampling, the Composite Decoupled Mixture-of-Experts (CDMoE) architecture, data-quality-driven data curation, and a three-stage curriculum training flow. The resulting policy is trained on only 4.08 hours of data, generalizes to 49.25 hours of test motions, and outperforms baselines on both routine and highly dynamic tasks. The work offers a practical path toward robust, generalizable humanoid whole-body control.

Abstract

Learning a general motion tracking policy from human motions shows great potential for versatile humanoid whole-body control. Conventional approaches are not only inefficient in data utilization and training processes but also exhibit limited performance when tracking highly dynamic motions. To address these challenges, we propose EGM, a framework that enables efficient learning of a general motion tracking policy. EGM integrates four core designs. Firstly, we introduce a Bin-based Cross-motion Curriculum Adaptive Sampling strategy to dynamically orchestrate the sampling probabilities based on tracking error of each motion bin, efficiently balancing the training process across motions with varying difficulty and durations. The sampled data is then processed by our proposed Composite Decoupled Mixture-of-Experts (CDMoE) architecture, which efficiently enhances the ability to track motions from different distributions by grouping experts separately for upper and lower body and decoupling orthogonal experts from shared experts to separately handle dedicated features and general features. Central to our approach is a key insight we identified: for training a general motion tracking policy, data quality and diversity are paramount. Building on these designs, we develop a three-stage curriculum training flow to progressively enhance the policy's robustness against disturbances. Despite training on only 4.08 hours of data, EGM generalized robustly across 49.25 hours of test motions, outperforming baselines on both routine and highly dynamic tasks.
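To make the sampling idea concrete, the following is a minimal sketch of error-driven sampling over motion bins. It is an illustration under stated assumptions, not the paper's implementation: the abstract only says that sampling probabilities follow the tracking error of each motion bin, so the fixed bin size, the power-law weighting, the uniform floor, and the names `split_into_bins`, `sampling_probabilities`, and `update_error` are all hypothetical choices made here.

```python
import random


def split_into_bins(motion_lengths, bin_size):
    """Cut every motion (given as a frame count) into bins of at most bin_size frames.

    Returns a flat list of (motion_id, start_frame, end_frame) tuples, so long and
    short motions are compared bin by bin rather than clip by clip.
    """
    bins = []
    for motion_id, n_frames in enumerate(motion_lengths):
        for start in range(0, n_frames, bin_size):
            bins.append((motion_id, start, min(start + bin_size, n_frames)))
    return bins


def sampling_probabilities(bin_errors, temperature=1.0, floor=0.1):
    """Turn per-bin tracking errors into sampling probabilities.

    Assumed rule (not from the paper): weight = error ** (1 / temperature),
    mixed with a uniform floor so that easy bins are still visited.
    """
    n = len(bin_errors)
    weights = [max(e, 0.0) ** (1.0 / temperature) for e in bin_errors]
    total = sum(weights)
    if total == 0.0:
        return [1.0 / n] * n  # no error signal yet: sample uniformly
    return [floor / n + (1.0 - floor) * w / total for w in weights]


def update_error(old_error, new_error, momentum=0.9):
    """Smooth the tracked error of a bin with an exponential moving average."""
    return momentum * old_error + (1.0 - momentum) * new_error


# Usage: two motions of 250 and 90 frames, cut into 100-frame bins.
bins = split_into_bins([250, 90], bin_size=100)
errors = [0.05, 0.40, 0.10, 0.05]  # hypothetical tracking error per bin
probs = sampling_probabilities(errors)

rng = random.Random(0)
motion_id, start, end = rng.choices(bins, weights=probs, k=1)[0]
```

In this sketch the bin with the largest tracking error receives the largest share of samples, while the uniform floor prevents well-tracked bins from being dropped entirely. Calling `update_error` after each rollout would let the probabilities shift as the policy improves.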
