Sim-to-Real of Humanoid Locomotion Policies via Joint Torque Space Perturbation Injection
Woohyun Cha, Junhyeok Cha, Jaeyong Shin, Donghyeon Kim, Jaeheung Park
TL;DR
The paper addresses the sim-to-real gap in humanoid locomotion, where domain randomization (DR) covers only a fixed set of simulation parameters. It injects perturbations in joint torque space: a neural network τ_φ generates state-dependent torque disturbances, randomized per episode, which are added to the policy torque τ_π during forward simulation. Trained with PPO, AMP, and a gradient penalty, the method uses privileged observations and a motion-imitation objective to learn stable, natural gaits for the TOCABI humanoid that remain robust to unseen actuator and contact dynamics. Experiments in simulation and on the real robot show greater robustness to complex reality gaps than the DR and random force injection baselines, with no loss in nominal task performance. Because the perturbations model unmodeled dynamics more expressively during training, the approach may extend to other high-dimensional robotic systems.
Abstract
This paper proposes a novel alternative to existing sim-to-real methods for training control policies with simulated experiences. Prior sim-to-real methods for legged robots mostly rely on the domain randomization approach, where a fixed finite set of simulation parameters is randomized during training. Instead, our method adds state-dependent perturbations to the input joint torque used for forward simulation during the training phase. These state-dependent perturbations are designed to simulate a broader range of reality gaps than those captured by randomizing a fixed set of simulation parameters. Experimental results show that our method enables humanoid locomotion policies that achieve greater robustness against complex reality gaps unseen in the training domain.
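The perturbation mechanism described above can be illustrated with a minimal sketch. It is not the authors' implementation. The class name `TorquePerturbation`, the helper `perturbed_torque`, the two-layer network, the Gaussian weight sampling, the `max_torque` bound, and the joint count are all assumptions chosen for illustration. The sketch only shows the general idea: a randomly drawn network maps the robot state to a torque offset, the network is redrawn at the start of each episode, and the offset is added to the policy torque before the simulator step.

```python
import numpy as np


class TorquePerturbation:
    """Hypothetical state-dependent torque perturbation tau_phi(s)."""

    def __init__(self, state_dim, num_joints, hidden=32, max_torque=5.0, seed=0):
        self.state_dim = state_dim
        self.num_joints = num_joints
        self.hidden = hidden
        self.max_torque = max_torque  # assumed per-joint bound on the perturbation
        self.rng = np.random.default_rng(seed)
        self.resample()

    def resample(self):
        """Draw new weights phi (assumed Gaussian); call once per episode."""
        s, h, j = self.state_dim, self.hidden, self.num_joints
        self.w1 = self.rng.normal(0.0, 1.0 / np.sqrt(s), (s, h))
        self.b1 = self.rng.normal(0.0, 0.1, h)
        self.w2 = self.rng.normal(0.0, 1.0 / np.sqrt(h), (h, j))
        self.b2 = self.rng.normal(0.0, 0.1, j)

    def __call__(self, state):
        """Map a state vector to a bounded torque offset, one value per joint."""
        hidden = np.tanh(state @ self.w1 + self.b1)
        return self.max_torque * np.tanh(hidden @ self.w2 + self.b2)


def perturbed_torque(tau_pi, state, perturbation):
    """Torque passed to forward simulation: policy torque plus perturbation."""
    return tau_pi + perturbation(state)


# Illustrative usage with made-up dimensions (joint positions and velocities).
num_joints = 12
perturbation = TorquePerturbation(state_dim=2 * num_joints, num_joints=num_joints)
state = np.zeros(2 * num_joints)
tau_pi = np.ones(num_joints)  # stand-in for the policy output
tau_sim = perturbed_torque(tau_pi, state, perturbation)
```

In a training loop, `perturbation.resample()` would be called at every episode reset, so that each episode sees a different but internally consistent disturbance function. Unlike a random force drawn independently at each step, the offset here is a deterministic function of the state within an episode.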
