Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer

Xinyang Gu, Yen-Jen Wang, Jianyu Chen

TL;DR

Humanoid-Gym addresses the sizable sim-to-real transfer gap in humanoid locomotion by offering an open-source RL framework built on Nvidia Isaac Gym, with domain randomization and a sim-to-sim validation path to MuJoCo. The approach uses PPO with asymmetric actor-critic and privileged information, a high-frequency PD-based action interface, and a gait-aware reward design to encourage stable locomotion. It demonstrates zero-shot sim-to-real transfer on two RobotEra humanoids (XBot-S and XBot-L) and validates policies across flat and uneven terrains, aided by MuJoCo calibration that aligns simulated dynamics with real-world behavior. The work provides an accessible, reproducible pipeline and a sim-to-sim analysis tool to bolster robustness before real-world deployment.
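The PD-based action interface mentioned above can be illustrated with a minimal sketch. It assumes the policy outputs joint-position offsets that a faster inner PD loop converts into torques. The function names, gains, action scale, torque limit, loop rates, and the unit-inertia joint model are all illustrative assumptions, not values or code taken from Humanoid-Gym.

```python
import numpy as np

def pd_torques(q_target, q, qd, kp, kd, torque_limit):
    """PD law: tau = kp * (q_target - q) - kd * qd, clipped to the actuator limit."""
    tau = kp * (q_target - q) - kd * qd
    return np.clip(tau, -torque_limit, torque_limit)

def run_policy_step(q, qd, action, default_q,
                    action_scale=0.25,   # assumed scaling of policy output
                    kp=200.0, kd=10.0,   # assumed PD gains
                    torque_limit=100.0,  # assumed actuator limit
                    decimation=10,       # assumed PD sub-steps per policy step
                    dt=0.001):           # assumed PD loop period in seconds
    """Apply one policy action through a higher-rate PD loop.

    The joint model is a toy unit-inertia joint integrated with
    semi-implicit Euler; it stands in for the physics simulator.
    """
    q_target = default_q + action_scale * action
    for _ in range(decimation):
        tau = pd_torques(q_target, q, qd, kp, kd, torque_limit)
        qd = qd + tau * dt   # unit inertia: acceleration equals torque
        q = q + qd * dt
    return q, qd

# Example: two joints at rest, policy asks for a positive offset on both.
q0 = np.zeros(2)
qd0 = np.zeros(2)
q1, qd1 = run_policy_step(q0, qd0, action=np.ones(2), default_q=np.zeros(2))
```

Holding the position target fixed while the PD loop runs several sub-steps is one common way to let a low-rate policy drive a high-rate torque controller; the actual rates and gains used by the authors are not stated in this summary.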

Abstract

Humanoid-Gym is an easy-to-use reinforcement learning (RL) framework based on Nvidia Isaac Gym, designed to train locomotion skills for humanoid robots, emphasizing zero-shot transfer from simulation to the real-world environment. Humanoid-Gym also integrates a sim-to-sim framework from Isaac Gym to MuJoCo that allows users to verify trained policies in a different physics simulation to check their robustness and generalization. The framework is verified on RobotEra's XBot-S (a 1.2-meter-tall humanoid robot) and XBot-L (a 1.65-meter-tall humanoid robot) in a real-world environment with zero-shot sim-to-real transfer. The project website and source code can be found at: https://sites.google.com/view/humanoid-gym/.

Paper Structure (11 sections, 2 equations, 7 figures, 4 tables)

This paper contains 11 sections, 2 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Humanoid-Gym enables users to train their policies within Nvidia Isaac Gym and validate them in MuJoCo. Additionally, we have successfully tested the complete pipeline with two humanoid robots. They were trained in Humanoid-Gym and transferred to real-world environments in a zero-shot manner.
  • Figure 2: Pipeline of Humanoid-Gym. Initially, we employ massively parallel deep reinforcement learning (RL) within Nvidia Isaac Gym, incorporating diverse terrains and dynamics randomization. Subsequently, we undertake sim-to-sim transfer to test policies. Due to our meticulous calibration, the performance in both MuJoCo and real-world settings aligns closely.
  • Figure 3: Sine wave in both MuJoCo and the real-world environment. After calibration, the two trajectories are very close.
  • Figure 4: Phase Portrait for MuJoCo, Real-World Environment, and Isaac Gym.
  • Figure 5: Hardware Platform. Our Humanoid-Gym framework is tested on two distinct sizes of humanoid robots, XBot-S and XBot-L, provided by Robot Era.
  • ...and 2 more figures