Towards Adaptive Humanoid Control via Multi-Behavior Distillation and Reinforced Fine-Tuning

Yingnan Zhao; Xinmiao Wang; Dewei Wang; Xinzhe Liu; Dan Lu; Qilong Han; Peng Liu; Chenjia Bai

TL;DR

Adaptive Humanoid Control (AHC) tackles versatile humanoid locomotion by learning a single controller that can stand up, walk, and recover from falls across diverse terrains. It uses a two-stage framework. The first stage distills behavior-specific policies into a basic multi-behavior policy, guided by an Adversarial Motion Prior. The second stage fine-tunes that policy with reinforcement learning on varied terrains, using gradient projection and a terrain curriculum. The resulting terrain-adaptive controller transfers from simulation to a real Unitree G1, where it recovers from falls and keeps locomotion stable under disturbances. The work offers a practical path toward generalizable humanoid locomotion without training a separate policy for every skill and terrain.
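The TL;DR names gradient projection without describing it. As a rough illustration only, the sketch below assumes a PCGrad-style scheme: when the gradients of two behaviors (for example, fall recovery and walking) point in conflicting directions, each one has its conflicting component removed before the gradients are summed. The function names and the toy two-dimensional gradients are hypothetical and are not taken from the paper, whose actual procedure may differ.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_conflicting(g: Sequence[float], other: Sequence[float]) -> List[float]:
    """Return g with its component along `other` removed, but only if they conflict.

    Two gradients "conflict" here when their inner product is negative
    (assumption: PCGrad-style criterion, not confirmed by the source).
    """
    d = dot(g, other)
    norm_sq = dot(other, other)
    if d >= 0 or norm_sq == 0:
        return list(g)  # no conflict (or zero gradient): leave g unchanged
    scale = d / norm_sq
    return [gi - scale * oi for gi, oi in zip(g, other)]


def combine_gradients(grads: Sequence[Sequence[float]]) -> List[float]:
    """Project each task gradient against the original gradients of all other tasks, then sum."""
    projected = []
    for i, g in enumerate(grads):
        g_proj = list(g)
        for j, other in enumerate(grads):
            if i != j:
                g_proj = project_conflicting(g_proj, other)
        projected.append(g_proj)
    return [sum(column) for column in zip(*projected)]


# Toy example with hypothetical gradients for two behaviors.
g_recover = [1.0, 0.0]
g_walk = [-1.0, 1.0]  # negative inner product with g_recover, so they conflict
update = combine_gradients([g_recover, g_walk])
```

After projection, each gradient is orthogonal to the other behavior's original gradient, so a shared update no longer directly undoes progress on either behavior to first order.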

Abstract

Humanoid robots hold promise for learning a diverse set of human-like locomotion behaviors, including standing up, walking, running, and jumping. However, existing methods predominantly require training an independent policy for each skill, yielding behavior-specific controllers that generalize poorly and perform brittlely when deployed on irregular terrains and in diverse situations. To address this challenge, we propose Adaptive Humanoid Control (AHC), which adopts a two-stage framework to learn an adaptive humanoid locomotion controller across different skills and terrains. Specifically, we first train several primary locomotion policies and perform multi-behavior distillation to obtain a basic multi-behavior controller, which enables adaptive behavior switching based on the environment. Then, we perform reinforced fine-tuning, collecting online feedback while the controller performs adaptive behaviors on more diverse terrains, which improves its terrain adaptability. We conduct experiments both in simulation and in the real world on Unitree G1 robots. The results show that our method adapts well across various situations and terrains. Project website: https://ahc-humanoid.github.io.
