Learning Agile Locomotion on Risky Terrains
Chong Zhang, Nikita Rudin, David Hoeller, Marco Hutter
TL;DR
This work tackles agile quadrupedal locomotion on risky terrains with sparse footholds by recasting locomotion as a navigation task and training with end-to-end RL. It introduces a three-pronged exploration strategy (a curriculum with relaxed progression, intrinsic curiosity, and symmetry-based augmentation) and a two-stage generalist-specialist training pipeline in which sensorimotor skills learned by the generalist are reused across terrains. Simulation and real-world experiments on an ANYmal-D robot show peak speeds of at least $2.5\,\mathrm{m/s}$ on stepping stones and beams, with successful sim-to-real transfer for two terrain types. The results demonstrate robust, diverse locomotion on challenging terrains. The authors acknowledge limitations in unified policy learning and reward design, and point to onboard perception and interpretable, unified policies as future work.
Abstract
Quadruped robots have shown remarkable mobility on various terrains through reinforcement learning. Yet, in the presence of sparse footholds and risky terrains such as stepping stones and balance beams, which require precise foot placement to avoid falls, model-based approaches are often used. In this paper, we show that end-to-end reinforcement learning can also enable the robot to traverse risky terrains with dynamic motions. To this end, our approach first trains a generalist policy for agile locomotion on disorderly and sparse stepping stones, then transfers its reusable knowledge to various more challenging terrains by finetuning specialist policies from it. Because the robot needs to rapidly adapt its velocity on these terrains, we formulate the task as a navigation task instead of the commonly used velocity tracking, which constrains the robot's behavior, and we propose an exploration strategy to overcome sparse rewards and achieve high robustness. We validate our proposed method through simulation and real-world experiments on an ANYmal-D robot, achieving a peak forward velocity of at least 2.5 m/s on sparse stepping stones and narrow balance beams. Video: youtu.be/Z5X0J8OH6z4
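
To make the contrast between the two task formulations concrete, the following is a minimal sketch and not the implementation used in this work. The function names, the exponential tracking kernel, the `sigma` and `window` parameters, and the `1 / (1 + d^2)` goal term are all illustrative assumptions. It shows only the structural difference: a velocity-tracking reward is dense and penalizes every deviation from the commanded velocity, whereas a navigation reward is sparse and scores only where the robot ends up.

```python
import math


def velocity_tracking_reward(v_xy, v_cmd_xy, sigma=0.25):
    """Dense reward (hypothetical form): highest when the base velocity
    matches the commanded velocity at the current step."""
    err = (v_xy[0] - v_cmd_xy[0]) ** 2 + (v_xy[1] - v_cmd_xy[1]) ** 2
    return math.exp(-err / sigma)


def navigation_reward(pos_xy, goal_xy, t, episode_len, window=1.0):
    """Sparse reward (hypothetical form): zero until the last `window`
    seconds of the episode, then larger the closer the base is to the goal.
    How the robot moves before that is left unconstrained."""
    if episode_len - t > window:
        return 0.0
    dist = math.hypot(goal_xy[0] - pos_xy[0], goal_xy[1] - pos_xy[1])
    return 1.0 / (1.0 + dist ** 2)


if __name__ == "__main__":
    # Slowing down to place a foot is penalized under velocity tracking...
    print(velocity_tracking_reward((0.2, 0.0), (1.0, 0.0)))
    # ...but costs nothing under the navigation formulation mid-episode.
    print(navigation_reward((1.0, 0.0), (3.0, 0.0), t=2.0, episode_len=6.0))
    # Only the final position relative to the goal is scored.
    print(navigation_reward((2.9, 0.0), (3.0, 0.0), t=5.5, episode_len=6.0))
```

Under these assumptions, the navigation reward leaves the policy free to accelerate, decelerate, or pause as the footholds require, at the price of a sparse learning signal, which is the motivation for the exploration strategy mentioned above.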
