BeamDojo: Learning Agile Humanoid Locomotion on Sparse Footholds
Huayi Wang, Zirui Wang, Junli Ren, Qingwei Ben, Tao Huang, Weinan Zhang, Jiangmiao Pang
TL;DR
BeamDojo is a two-stage reinforcement learning framework for humanoid locomotion on sparse footholds. It addresses the challenges of polygonal feet and sparse foothold rewards with a sampling-based foothold reward and a double-critic architecture. Stage 1 performs soft-terrain exploration on flat terrain that carries the task terrain's perceptual observations, and Stage 2 fine-tunes the policy on the actual hard terrain; both are guided by terrain information from a LiDAR-based elevation map and reinforced by a terrain-aware curriculum. Results in simulation and on a Unitree G1 show high success rates, precise foothold placement, and robust performance under disturbances, with sim-to-real transfer aided by domain randomization. The work also analyzes design choices for foothold rewards, curricula, and perception strategies, highlighting the importance of elevation-map-based perception and two-stage training for generalization to non-flat terrains and real-world variability.
Abstract
Traversing risky terrains with sparse footholds poses a significant challenge for humanoid robots, requiring precise foot placements and stable locomotion. Existing learning-based approaches often struggle on such complex terrains due to sparse foothold rewards and inefficient learning processes. To address these challenges, we introduce BeamDojo, a reinforcement learning (RL) framework designed to enable agile humanoid locomotion on sparse footholds. BeamDojo begins by introducing a sampling-based foothold reward tailored for polygonal feet, along with a double critic to balance the learning process between dense locomotion rewards and sparse foothold rewards. To encourage sufficient trial-and-error exploration, BeamDojo incorporates a two-stage RL approach: the first stage relaxes the terrain dynamics by training the humanoid on flat terrain while providing it with task-terrain perceptive observations, and the second stage fine-tunes the policy on the actual task terrain. Moreover, we implement an onboard LiDAR-based elevation map to enable real-world deployment. Extensive simulation and real-world experiments demonstrate that BeamDojo achieves efficient learning in simulation and enables agile locomotion with precise foot placement on sparse footholds in the real world, maintaining a high success rate even under significant external disturbances.
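To make the two reward-side ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a rectangular sole sampled on a regular grid, a reward equal to the negative number of sampled points that miss a safe foothold, and a double critic whose two advantage estimates are normalized separately before a weighted sum. The function names, sole dimensions, grid size, and weights (`sample_foot_points`, `foothold_reward`, `combine_advantages`, `w_dense`, `w_sparse`) are all hypothetical and chosen only for illustration.

```python
import numpy as np

def sample_foot_points(center, yaw, length=0.2, width=0.08, n=5):
    """Return an (n*n, 2) grid of world-frame points covering a rectangular sole.

    The sole dimensions and grid size are illustrative assumptions.
    """
    xs = np.linspace(-length / 2, length / 2, n)
    ys = np.linspace(-width / 2, width / 2, n)
    gx, gy = np.meshgrid(xs, ys)
    local = np.stack([gx.ravel(), gy.ravel()], axis=1)
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s], [s, c]])
    # Rotate sole-frame points by yaw, then translate to the foot center.
    return local @ rot.T + np.asarray(center, dtype=float)

def foothold_reward(points, in_contact, is_safe):
    """Assumed reward form: minus the number of sampled points off a safe foothold.

    A swing foot (not in contact) contributes no penalty.
    """
    if not in_contact:
        return 0.0
    unsafe = sum(0 if is_safe(p) else 1 for p in points)
    return float(-unsafe)

def combine_advantages(adv_dense, adv_sparse, w_dense=1.0, w_sparse=0.25, eps=1e-8):
    """Normalize each critic's advantages separately, then take a weighted sum.

    Separate normalization keeps the sparse foothold signal from being
    swamped by the scale of the dense locomotion rewards. Weights are assumed.
    """
    def norm(a):
        return (a - a.mean()) / (a.std() + eps)
    return w_dense * norm(adv_dense) + w_sparse * norm(adv_sparse)

# Usage: a beam along the x-axis with half-width 0.11 m (hypothetical terrain).
on_beam = lambda p: abs(p[1]) <= 0.11
centered = sample_foot_points(center=(0.0, 0.0), yaw=0.0)
offset = sample_foot_points(center=(0.0, 0.1), yaw=0.0)
r_centered = foothold_reward(centered, True, on_beam)  # sole fully on the beam
r_offset = foothold_reward(offset, True, on_beam)      # sole partly off the beam
```

Sampling the whole sole, rather than checking only the foot center, is what lets a reward of this kind distinguish a foot that is fully supported from one that overhangs the edge of a narrow foothold.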
