ASAP: Aligning Simulation and Real-World Physics for Learning Agile Humanoid Whole-Body Skills

Tairan He, Jiawei Gao, Wenli Xiao, Yuanhang Zhang, Zi Wang, Jiashun Wang, Zhengyi Luo, Guanqi He, Nikhil Sobanbabu, Chaoyi Pan, Zeji Yi, Guannan Qu, Kris Kitani, Jessica Hodgins, Linxi "Jim" Fan, Yuke Zhu, Changliu Liu, Guanya Shi

TL;DR

ASAP tackles the sim-to-real dynamics mismatch in agile humanoid control with a two-stage approach. First, it pre-trains a phase-based motion-tracking policy in simulation from retargeted human motions. Second, it deploys that policy to collect real-world data, trains a delta (residual) action model that compensates for the dynamics gap, and fine-tunes the policy in a simulator augmented with this model before redeployment. The method is validated across sim-to-sim transfers (IsaacGym to IsaacSim, IsaacGym to Genesis) and sim-to-real transfer (IsaacGym to the Unitree G1 hardware), where it reduces tracking error compared with SysID, DR, and delta-dynamics baselines. The work demonstrates the practicality of delta action learning for bridging simulation and reality in highly dynamic humanoid tasks and provides an open-source multi-simulator framework to accelerate future research. Overall, ASAP enables more expressive and agile humanoid motions by combining simulation-based pre-training with a data-driven dynamics correction learned from real-world rollouts.

Abstract

Humanoid robots hold the potential for unparalleled versatility in performing human-like, whole-body skills. However, achieving agile and coordinated whole-body motions remains a significant challenge due to the dynamics mismatch between simulation and the real world. Existing approaches, such as system identification (SysID) and domain randomization (DR) methods, often rely on labor-intensive parameter tuning or result in overly conservative policies that sacrifice agility. In this paper, we present ASAP (Aligning Simulation and Real-World Physics), a two-stage framework designed to tackle the dynamics mismatch and enable agile humanoid whole-body skills. In the first stage, we pre-train motion tracking policies in simulation using retargeted human motion data. In the second stage, we deploy the policies in the real world and collect real-world data to train a delta (residual) action model that compensates for the dynamics mismatch. Then, ASAP fine-tunes pre-trained policies with the delta action model integrated into the simulator to align effectively with real-world dynamics. We evaluate ASAP across three transfer scenarios: IsaacGym to IsaacSim, IsaacGym to Genesis, and IsaacGym to the real-world Unitree G1 humanoid robot. Our approach significantly improves agility and whole-body coordination across various dynamic motions, reducing tracking error compared to SysID, DR, and delta dynamics learning baselines. ASAP enables highly agile motions that were previously difficult to achieve, demonstrating the potential of delta action learning in bridging simulation and real-world dynamics. These results suggest a promising sim-to-real direction for developing more expressive and agile humanoids.
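The delta action idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's implementation: it assumes a hypothetical 1D point mass, a made-up "real world" with a weaker actuator and unmodeled damping, and a linear delta action model fitted by least squares in place of the learned model that ASAP trains on real humanoid rollouts. All function names and parameters (`sim_step`, `real_step`, `gain`, `damping`) are illustrative assumptions.

```python
import numpy as np

DT = 0.02  # integration step in seconds (assumed)

def sim_step(s, a):
    """Toy simulator: unit-mass point, state = [position, velocity]."""
    pos, vel = s
    vel_next = vel + a * DT
    return np.array([pos + vel_next * DT, vel_next])

def real_step(s, a, gain=0.8, damping=0.5):
    """Hypothetical 'real world': weaker actuator plus unmodeled damping."""
    pos, vel = s
    vel_next = vel + (gain * a - damping * vel) * DT
    return np.array([pos + vel_next * DT, vel_next])

def collect_real_data(n=500, seed=0):
    """Stand-in for real-world rollouts: random (state, action, next state)."""
    rng = np.random.default_rng(seed)
    states = rng.uniform(-1.0, 1.0, size=(n, 2))
    actions = rng.uniform(-5.0, 5.0, size=n)
    next_states = np.array([real_step(s, a) for s, a in zip(states, actions)])
    return states, actions, next_states

def fit_delta_action(states, actions, next_states):
    """Fit a linear delta action: delta = w . [pos, vel, a, 1]."""
    # Action the simulator would need to reproduce each real transition.
    a_needed = (next_states[:, 1] - states[:, 1]) / DT
    features = np.column_stack([states, actions, np.ones(len(actions))])
    w, *_ = np.linalg.lstsq(features, a_needed - actions, rcond=None)
    return w

def aligned_sim_step(w, s, a):
    """Simulator with the delta action added to the commanded action."""
    delta = float(np.array([s[0], s[1], a, 1.0]) @ w)
    return sim_step(s, a + delta)

states, actions, next_states = collect_real_data()
w = fit_delta_action(states, actions, next_states)

s0, a0 = np.array([0.3, -0.4]), 2.0
gap_before = np.abs(sim_step(s0, a0) - real_step(s0, a0)).max()
gap_after = np.abs(aligned_sim_step(w, s0, a0) - real_step(s0, a0)).max()
print(f"sim-to-real gap before: {gap_before:.2e}, after: {gap_after:.2e}")
```

In this toy setting the corrected simulator matches the "real" transitions because the mismatch happens to be linear in the state and action. In ASAP's second stage, the analogous step is fine-tuning the pre-trained policy inside the simulator with the delta action model integrated, as the abstract describes.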

Paper Structure

This paper contains 39 sections, 7 equations, 12 figures, 7 tables.

Figures (12)

  • Figure 1: The humanoid robot (Unitree G1) demonstrates diverse agile whole-body skills, showcasing the control policies' agility: (a) Cristiano Ronaldo’s signature celebration, a jump with a 180-degree mid-air rotation; (b) LeBron James’s "Silencer" celebration, which involves single-leg balancing; (c) Kobe Bryant’s famous fadeaway jump shot, which involves single-leg jumping and landing; (d) a 1.5 m forward jump; (e) leg stretching; and (f) a 1.3 m side jump.
  • Figure 3: Retargeting Human Video Motions to Robot Motions: (a) Human motions are captured from video. (b) Using TRAM (Wang et al., 2025), 3D human motion is reconstructed in the SMPL parameter format. (c) A reinforcement learning (RL) policy is trained in simulation to track the SMPL motion. (d) The learned SMPL motion is retargeted to the Unitree G1 humanoid robot in simulation. (e) The trained RL policy is deployed on the real robot, executing the final motion in the physical world. This pipeline ensures the retargeted motions remain physically feasible and suitable for real-world deployment.
  • Figure 4: Baselines of ASAP. (a) Model-free RL training. (b) System ID from real to sim using real-world data. (c) Learning delta dynamics model using real-world data. (d) Our proposed method, learning delta action model using real-world data.
  • Figure 5: Replaying IsaacSim state-action trajectories in IsaacGym. The upper four panels visualize the Unitree G1 humanoid executing a soccer-shooting motion under four distinct open-loop actions. The corresponding metric curves (bottom) quantify tracking performance. Importantly, our delta action model (ASAP) is trained across multiple motions and is not overfitted to this specific example.
  • Figure 6: Visual comparisons of motion imitation results across different difficulty levels (Easy, Medium, Hard) for various tasks including Jump Forward, Side Jump, Single Foot Balance, Squat, Step Backward, Step Forward, and Walk.
  • ...and 7 more figures