Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion
Tairan He, Chong Zhang, Wenli Xiao, Guanqi He, Changliu Liu, Guanya Shi
TL;DR
ABS is a dual-policy framework for agile yet safe legged locomotion. It combines a learning-based agile policy with a policy-conditioned reach-avoid (RA) value network and a recovery policy guided by the gradients of the RA value. The system uses a low-dimensional exteroceptive ray representation and a ray-prediction network to enable real-time collision avoidance with onboard sensing. All modules are trained entirely in simulation with domain randomization and a curriculum, and are deployed directly on a Unitree Go1 with onboard computation. Real-world experiments demonstrate high-speed, collision-free locomotion in indoor and outdoor environments, and extensive analyses examine the design choices that balance agility, safety, perception, and sim-to-real transfer. The work advances safe, high-speed locomotion by integrating model-free learning with control-theoretic safety principles in a closed-loop, policy-conditioned framework.
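To make the idea of a low-dimensional ray representation concrete, the following is a minimal 2D sketch that computes ray distances from a robot pose to circular obstacles. It is an illustration under stated assumptions, not the paper's perception pipeline: the function name `ray_distances`, the number of rays, the field of view, the maximum range, and the circular obstacle model are all hypothetical choices made here.

```python
import math

def ray_distances(robot_xy, heading, obstacles, n_rays=11, fov=math.pi, max_range=6.0):
    """Cast n_rays evenly spread over fov and return the distance each ray travels.

    obstacles is a list of circles (cx, cy, radius). All parameters are
    illustrative assumptions, not values taken from the paper.
    """
    px, py = robot_xy
    dists = []
    for i in range(n_rays):
        angle = heading - fov / 2 + fov * i / (n_rays - 1)
        dx, dy = math.cos(angle), math.sin(angle)
        best = max_range  # rays that hit nothing report the maximum range
        for cx, cy, r in obstacles:
            # Solve |p + t*d - c|^2 = r^2 for the nearest intersection t >= 0.
            ox, oy = px - cx, py - cy
            b = ox * dx + oy * dy
            c = ox * ox + oy * oy - r * r
            disc = b * b - c
            if disc < 0:
                continue  # the ray misses this circle
            t = -b - math.sqrt(disc)
            if 0.0 <= t < best:
                best = t
        dists.append(best)
    return dists

# One obstacle of radius 0.5 m centred 2 m straight ahead of the robot.
rays = ray_distances((0.0, 0.0), 0.0, [(2.0, 0.0, 0.5)])
```

A short vector like `rays` is the kind of compact exteroceptive input a policy or value network could consume instead of a full depth image; in the described system a ray-prediction network produces such a representation from onboard sensing.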
Abstract
Legged robots navigating cluttered environments must be jointly agile for efficient task execution and safe to avoid collisions with obstacles or humans. Existing studies either develop conservative controllers (< 1.0 m/s) to ensure safety, or focus on agility without considering potentially fatal collisions. This paper introduces Agile But Safe (ABS), a learning-based control framework that enables agile and collision-free locomotion for quadrupedal robots. ABS involves an agile policy to execute agile motor skills amidst obstacles and a recovery policy to prevent failures, collaboratively achieving high-speed and collision-free navigation. The policy switch in ABS is governed by a learned control-theoretic reach-avoid value network, which also guides the recovery policy as an objective function, thereby safeguarding the robot in a closed loop. The training process involves the learning of the agile policy, the reach-avoid value network, the recovery policy, and an exteroception representation network, all in simulation. These trained modules can be directly deployed in the real world with onboard sensing and computation, leading to high-speed and collision-free navigation in confined indoor and outdoor spaces with both static and dynamic obstacles.
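The policy switch described above can be sketched in a few lines. This is a toy illustration under assumptions, not the authors' implementation: `toy_ra_value` is a hand-written stand-in for the learned reach-avoid value network (and, unlike the real one, is not conditioned on a policy), the convention that lower values are safer and the threshold of 0.0 are assumed here, and the finite-difference gradient descent in `refine_twist` only illustrates how a value function can serve as an objective for choosing a safer command.

```python
from typing import Callable, Tuple

Twist = Tuple[float, float]  # (forward velocity, yaw rate); hypothetical command format

def toy_ra_value(twist: Twist, obstacle_dist: float) -> float:
    """Toy stand-in for a learned RA value. Assumed convention: lower is safer."""
    vx, wz = twist
    stopping_margin = obstacle_dist - 0.5 * vx * vx  # crude braking-distance proxy
    return -stopping_margin + 0.1 * wz * wz

def select_policy(value: float, threshold: float = 0.0) -> str:
    """Switch to the recovery policy once the RA value reaches the threshold."""
    return "recovery" if value >= threshold else "agile"

def refine_twist(value_fn: Callable[[Twist], float], twist: Twist,
                 threshold: float = 0.0, lr: float = 0.1,
                 eps: float = 1e-4, max_steps: int = 100) -> Twist:
    """Descend the value function until the command is below the threshold."""
    vx, wz = twist
    for _ in range(max_steps):
        if value_fn((vx, wz)) < threshold:
            break
        # Central finite differences approximate the gradient of the value.
        g_vx = (value_fn((vx + eps, wz)) - value_fn((vx - eps, wz))) / (2 * eps)
        g_wz = (value_fn((vx, wz + eps)) - value_fn((vx, wz - eps))) / (2 * eps)
        vx, wz = vx - lr * g_vx, wz - lr * g_wz
    return vx, wz

# Running at 3 m/s with an obstacle 2 m ahead triggers the recovery branch,
# and the refined command slows down until the toy value is below the threshold.
fast = (3.0, 0.0)
mode = select_policy(toy_ra_value(fast, 2.0))
safe_twist = refine_twist(lambda t: toy_ra_value(t, 2.0), fast)
```

Because the same value function both triggers the switch and shapes the recovery command, the safeguard operates in a closed loop: the command is re-evaluated at every control step rather than planned once.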
