Robot Parkour Learning

Ziwen Zhuang; Zipeng Fu; Jianren Wang; Christopher Atkeson; Soeren Schwertfeger; Chelsea Finn; Hang Zhao

TL;DR

We present an end-to-end, vision-based parkour system for low-cost quadrupeds. It learns diverse skills (climb, leap, crawl, tilt, run) with a two-stage RL curriculum, pre-training under soft dynamics constraints and then fine-tuning under hard dynamics constraints, and distills the resulting skills via DAgger into a single vision-based parkour policy for onboard deployment. The method relies on privileged simulation information during training, depth-based vision, and automatic curricula to overcome difficult exploration and to achieve robust sim-to-real transfer. Simulation and real-world experiments on the Unitree A1 and Go1 show that the robot autonomously selects and executes the appropriate parkour skill in indoor and outdoor environments, with competitive success rates. The work contributes an open-source framework, a two-stage learning approach, and empirical evidence that end-to-end vision-based parkour is feasible on affordable hardware.

Abstract

Parkour is a grand challenge for legged locomotion that requires robots to overcome various obstacles rapidly in complex environments. Existing methods can generate either diverse but blind locomotion skills or vision-based but specialized skills by using reference animal data or complex rewards. However, autonomous parkour requires robots to learn generalizable skills that are both vision-based and diverse to perceive and react to various scenarios. In this work, we propose a system for learning a single end-to-end vision-based parkour policy of diverse parkour skills using a simple reward without any reference motion data. We develop a reinforcement learning method inspired by direct collocation to generate parkour skills, including climbing over high obstacles, leaping over large gaps, crawling beneath low barriers, squeezing through thin slits, and running. We distill these skills into a single vision-based parkour policy and transfer it to a quadrupedal robot using its egocentric depth camera. We demonstrate that our system can empower two different low-cost robots to autonomously select and execute appropriate parkour skills to traverse challenging real-world environments.
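
The distillation step mentioned above can be illustrated with a toy DAgger-style loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the linear teacher, the toy dynamics, the noisy observation standing in for depth features, and every name and constant (`W_TEACHER`, `B_DYN`, `observe`, `rollout`, `dagger`) are hypothetical. It only shows the pattern: the student controls the rollout, the teacher (with privileged state) labels the visited states, and the student is refit on the aggregated data.

```python
import numpy as np

STATE_DIM, ACT_DIM = 4, 2
_init_rng = np.random.default_rng(0)
# Hypothetical stand-ins: a linear "teacher" acting on privileged state,
# and a fixed matrix for toy dynamics. Neither comes from the paper.
W_TEACHER = _init_rng.normal(size=(STATE_DIM, ACT_DIM))
B_DYN = _init_rng.normal(size=(ACT_DIM, STATE_DIM))


def teacher_action(state):
    """Teacher sees the privileged (noise-free) state."""
    return state @ W_TEACHER


def observe(state, rng):
    """Student sees a noisy observation (stand-in for onboard sensing)."""
    return state + 0.01 * rng.normal(size=state.shape)


def rollout(w_student, rng, horizon=50):
    """Roll out the STUDENT; record its observations and the teacher's labels."""
    state = rng.normal(size=STATE_DIM)
    obs_buf, label_buf = [], []
    for _ in range(horizon):
        obs = observe(state, rng)
        obs_buf.append(obs)
        label_buf.append(teacher_action(state))  # teacher relabels visited state
        action = obs @ w_student                 # student's action drives the env
        # Toy bounded dynamics (assumption): decay + squashed control + noise.
        state = 0.5 * state + 0.1 * np.tanh(action @ B_DYN) \
            + 0.1 * rng.normal(size=STATE_DIM)
    return np.array(obs_buf), np.array(label_buf)


def dagger(n_iters=5, seed=0):
    """Aggregate (observation, teacher label) pairs and refit each iteration."""
    rng = np.random.default_rng(seed)
    w_student = np.zeros((STATE_DIM, ACT_DIM))
    all_obs, all_labels = [], []
    for _ in range(n_iters):
        obs, labels = rollout(w_student, rng)
        all_obs.append(obs)
        all_labels.append(labels)
        X = np.concatenate(all_obs)
        Y = np.concatenate(all_labels)
        # Supervised step: least-squares fit of student to teacher labels.
        w_student, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return w_student


if __name__ == "__main__":
    w = dagger()
    print("max weight error:", np.abs(w - W_TEACHER).max())
```

In a real system the least-squares fit would be replaced by gradient updates on a neural network taking depth images, and the teacher would be the set of specialized skill policies; the data flow stays the same.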

Paper Structure (32 sections, 5 equations, 9 figures, 11 tables)

This paper contains 32 sections, 5 equations, 9 figures, 11 tables.

Introduction
Related Work
Agile Locomotion.
Vision-Based Locomotion.
Robot Parkour Learning Systems
Parkour Skills Learning via Two-Stage RL
RL Pre-training with Soft Dynamics Constraints.
RL Fine-tuning with Hard Dynamics Constraints.
Learning a Single Parkour Policy by Distillation
Sim-to-Real and Deployment
Experimental Results
Robot and Simulation Setup.
Baselines and Ablations.
Simulation Experiments
Vision is crucial for learning parkour.
...and 17 more sections

Figures (9)

Figure 1: We present a framework for learning parkour skills on low-cost robots. Our end-to-end vision-based parkour learning system enables the robot to climb high obstacles, leap over large gaps, crawl beneath low barriers, squeeze through thin slits, and run. Videos are on the project website.
Figure 2: We illustrate the challenging obstacles that our system can solve, including climbing high obstacles of 0.40m (1.53x robot height), leaping over large gaps of 0.60m (1.5x robot length), crawling beneath low barriers of 0.2m (0.76x robot height), and squeezing through thin slits of 0.28m (less than the robot width) by tilting.
Figure 3: Soft dynamics constraints and hard dynamics constraints for each skill. Given soft dynamics constraints, the obstacles are penetrable.
Figure 4: We show collision points on the robot. Collision points that penetrate obstacles are in red.
Figure 5: We bridge the visual gap between simulation and the real world by applying pre-processing techniques. In simulation we use depth clipping, Gaussian noise, and random artifacts; in the real world we use depth clipping, hole-filling, and spatial and temporal filters.
...and 4 more figures
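
The simulation-side pre-processing named in the Figure 5 caption (depth clipping, Gaussian noise, random artifacts) could be sketched roughly as follows. This is an illustration only: the function name `simulate_depth_noise`, the clipping range, the noise level, and the choice of square patches set to the far-clip value as "artifacts" are all assumptions, not values or methods taken from the paper.

```python
import numpy as np


def simulate_depth_noise(depth, rng, near=0.0, far=2.0,
                         noise_std=0.01, n_artifacts=3, artifact_size=4):
    """Degrade a clean simulated depth image of shape (H, W), in meters.

    All default parameter values are hypothetical placeholders.
    """
    # 1) Depth clipping: limit readings to an assumed [near, far] range.
    out = np.clip(depth.astype(np.float64), near, far)
    # 2) Gaussian noise: additive per-pixel noise.
    out = out + rng.normal(0.0, noise_std, size=out.shape)
    # 3) Random artifacts: overwrite a few square patches with the far value,
    #    one simple (assumed) way to mimic invalid regions in real depth images.
    h, w = out.shape
    for _ in range(n_artifacts):
        r = rng.integers(0, h - artifact_size + 1)
        c = rng.integers(0, w - artifact_size + 1)
        out[r:r + artifact_size, c:c + artifact_size] = far
    # Re-clip so added noise cannot leave the valid range.
    return np.clip(out, near, far)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.uniform(0.0, 5.0, size=(48, 64))  # fake depth image
    noisy = simulate_depth_noise(clean, rng)
    print(noisy.shape, noisy.min(), noisy.max())
```

The real-world side of the caption (hole-filling, spatial and temporal filters) runs on the camera stream instead and is not shown here.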
