SMAP: Self-supervised Motion Adaptation for Physically Plausible Humanoid Whole-body Control

Haoyu Zhao, Sixu Lin, Qingwei Ben, Minyue Dai, Hao Fei, Jingbo Wang, Hua Zou, Junting Dong

TL;DR

SMAP addresses the challenge of transferring human motion to humanoid robots by bridging the heterogeneous action spaces with a vector-quantized periodic autoencoder (Humanoid-Adapter) that maps human motion to physically plausible humanoid actions. It combines this with progressive control policy learning via a privileged teacher and decoupled rewards to achieve stable, high-fidelity whole-body control in sim-to-real settings, demonstrated on the Unitree H1. Key contributions include the Humanoid-Adapter for cross-domain motion alignment, a phase-manifold-based motion representation, and a two-stage teacher-student training regimen that improves convergence and robustness. The approach enables reliable imitation of diverse human motions on real humanoids, with practical implications for scalable and safe humanoid manipulation in human environments.

Abstract

This paper presents a novel framework that enables real-world humanoid robots to maintain stability while performing human-like motion. Current methods use reinforcement learning to train a policy that lets a humanoid robot follow human body motion, relying on large amounts of retargeted human data. However, due to the heterogeneity between human and humanoid robot motion, directly using retargeted human motion reduces training efficiency and stability. To this end, we introduce SMAP, a novel whole-body tracking framework that bridges the gap between human and humanoid action spaces, enabling accurate motion mimicry by humanoid robots. The core idea is to use a vector-quantized periodic autoencoder to capture generic atomic behaviors and adapt human motion into physically plausible humanoid motion. This adaptation accelerates training convergence and improves stability when handling novel or challenging motions. We then employ a privileged teacher to distill precise mimicry skills into the student policy with a proposed decoupled reward. We conduct experiments in simulation and in the real world to demonstrate the superior stability and performance of SMAP over SOTA methods, offering practical guidelines for advancing whole-body control in humanoid robots.
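As a rough illustration of the vector-quantization step mentioned in the abstract, the sketch below snaps a continuous latent to its nearest codebook entry. It is a minimal example under stated assumptions, not the paper's implementation: the function name `quantize`, the 2-D codebook values, and the two example latents are all hypothetical, and the periodic (phase) part of the autoencoder is omitted entirely. The point it tries to show is that when a human-motion latent and a robot-motion latent are quantized against the same codebook, nearby latents resolve to the same discrete code.

```python
from typing import List, Sequence, Tuple


def quantize(
    latent: Sequence[float], codebook: Sequence[Sequence[float]]
) -> Tuple[int, List[float]]:
    """Return (index, code) of the codebook entry nearest to `latent`.

    Nearness is measured by squared Euclidean distance.
    """
    best_index = min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(latent, codebook[i])),
    )
    return best_index, list(codebook[best_index])


# Hypothetical 2-D codebook shared by both encoders (illustrative values only).
shared_codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

# Hypothetical latents, standing in for the outputs of a human-motion
# encoder and a robot-motion encoder (neither encoder is modeled here).
human_latent = [0.9, 0.1]
robot_latent = [1.1, -0.2]

human_idx, human_code = quantize(human_latent, shared_codebook)
robot_idx, robot_code = quantize(robot_latent, shared_codebook)

print(human_idx, robot_idx)
```

In this toy setup both latents select the same codebook entry, which is the sense in which a shared discrete code can act as a common representation for the two motion domains. In a trained model the codebook and encoders would be learned jointly rather than fixed by hand.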

Paper Structure

This paper contains 16 sections, 4 equations, 10 figures, 1 table.

Figures (10)

  • Figure 1: Our framework enables a humanoid robot to execute various expressive whole-body motions. The robot can (a) turn around and walk forward, (b) wave hello, (c) swing arms while advancing, (d) jump on one leg, (e) walk fast.
  • Figure 2: t-SNE visualization of the distribution of retargeted human motion, humanoid robot motion (recorded within the simulator), and motion adapted by Humanoid-Adapter on the CMU MoCap dataset.
  • Figure 3: Pipeline of SMAP. Given human motion, we use the proposed pre-trained Humanoid-Adapter (details shown in Fig. 4) to adapt human motion into corresponding, physically plausible humanoid robot motion. Our sim-to-real policy is distilled via imitation learning from an RL-trained privileged teacher policy that leverages privileged information with the proposed decoupled reward. The policy is then transferred to the real world.
  • Figure 4: Humanoid-Adapter. To align heterogeneous human motion $\mathcal{S}^h$ and humanoid robot motion $\mathcal{S}^r$, we train two VQ-PAEs on them to learn a shared phase manifold using a codebook.
  • Figure 5: Qualitative results on the H1 robot in simulation.
  • ...and 5 more figures