ANYmal Parkour: Learning Agile Navigation for Quadrupedal Robots

David Hoeller, Nikita Rudin, Dhionis Sako, Marco Hutter

TL;DR

The paper presents a fully learned, sim-to-real pipeline for agile parkour-style navigation of a quadrupedal robot, integrating a perception module that reconstructs 3D terrain, a repertoire of low-level locomotion skills, and a high-level navigation policy that selects and times skill execution. A hybrid PPO framework enables joint optimization of continuous motor commands and discrete skill choices, while a multi-resolution perception network provides near-field detail and far-field context. The approach enables real-time, onboard operation without pre-mapped environments or expert demonstrations, achieving speeds up to 2 m/s on challenging obstacle sequences with robust perception under occlusion. Together these components demonstrate a scalable, end-to-end learned solution for parkour-like mobility with potential applications in search-and-rescue and complex terrain exploration.

Abstract

Performing agile navigation with four-legged robots is a challenging task due to the highly dynamic motions, contacts with various parts of the robot, and the limited field of view of the perception sensors. In this paper, we propose a fully-learned approach to train such robots and conquer scenarios that are reminiscent of parkour challenges. The method involves training advanced locomotion skills for several types of obstacles, such as walking, jumping, climbing, and crouching, and then using a high-level policy to select and control those skills across the terrain. Thanks to our hierarchical formulation, the navigation policy is aware of the capabilities of each skill, and it will adapt its behavior depending on the scenario at hand. Additionally, a perception module is trained to reconstruct obstacles from highly occluded and noisy sensory data and endows the pipeline with scene understanding. Compared to previous attempts, our method can plan a path for challenging scenarios without expert demonstration, offline computation, a priori knowledge of the environment, or taking contacts explicitly into account. While these modules are trained from simulated data only, our real-world experiments demonstrate successful transfer on hardware, where the robot navigates and crosses consecutive challenging obstacles with speeds of up to two meters per second. The supplementary video can be found on the project website: https://sites.google.com/leggedrobotics.com/agile-navigation
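The hybrid PPO framework mentioned in the TL;DR pairs a continuous output with a discrete skill choice. The sketch below is not the authors' implementation. It shows one common way to score such a hybrid action: treat the continuous command and the skill index as independent given the observation, sum their log-probabilities, and pass the result to the standard PPO clipped surrogate. The skill names come from the abstract and Figure 4; the function names, the two-dimensional command, and the clip value of 0.2 are assumptions made for illustration.

```python
import math

# Skill labels taken from the paper's figure captions; the ordering is an assumption.
SKILLS = ["walking", "jumping", "climbing_up", "climbing_down", "crouching"]


def gaussian_log_prob(x, mean, std):
    """Log-density of a diagonal Gaussian, summed over action dimensions."""
    return sum(
        -0.5 * ((xi - mi) / si) ** 2 - math.log(si) - 0.5 * math.log(2 * math.pi)
        for xi, mi, si in zip(x, mean, std)
    )


def categorical_log_prob(index, logits):
    """Log-probability of one discrete choice under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[index] - log_z


def hybrid_log_prob(command, skill, mean, std, logits):
    """Joint log-probability, assuming the two action parts are independent."""
    return gaussian_log_prob(command, mean, std) + categorical_log_prob(skill, logits)


def ppo_clipped_objective(new_log_prob, old_log_prob, advantage, clip=0.2):
    """Standard PPO clipped surrogate for a single sample."""
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped = min(max(ratio, 1.0 - clip), 1.0 + clip)
    return min(ratio * advantage, clipped * advantage)


# Hypothetical sample: a 2-D command and the "jumping" skill.
old_lp = hybrid_log_prob([1.0, 0.0], 1, [0.8, 0.1], [0.5, 0.5], [0.0] * 5)
new_lp = hybrid_log_prob([1.0, 0.0], 1, [0.9, 0.0], [0.5, 0.5], [0.0, 1.0, 0.0, 0.0, 0.0])
objective = ppo_clipped_objective(new_lp, old_lp, advantage=1.0)
```

Because the joint log-probability is a plain sum, a single probability ratio covers both action parts, so one clipped objective can update the continuous and discrete heads together.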

Paper Structure (30 sections, 11 figures, 7 tables)

Figures (11)

  • Figure 1: Deployment of the pipeline on the quadrupedal robot ANYmal D. The robot performs highly dynamic maneuvers and makes contacts with its limbs where necessary.
  • Figure 2: Description of our approach. We decompose the problem into three components: The perception module receives the point cloud measurements to estimate the scene's layout and produces a latent tensor and a map. The locomotion module contains several low-level skills that can overcome specific scenarios. The navigation module is given a target goal and uses the latent to plan a path and select the correct skill.
  • Figure 3: Deployment of the pipeline on the robot ANYmal D. (A) Trajectory on the real robot. (B) Trajectory in simulation. (A1)-(A3) and (B1)-(B3) depict the profiles of the robot's speed, the selected skills, and two joint angles and torques corresponding to (A) and (B), respectively. The system leverages the motors' full torque capabilities and uses large deflections of the joints to reach high speeds and overcome challenging obstacles.
  • Figure 4: Training scenarios of the locomotion skills with the resulting behaviors. (A) Jumping. (B) Climbing down. (C) Climbing up. (D) Crouching. (E) Walking. (F) Success rate of each skill for obstacles of varying difficulty. (G) Ranges of parameters used during training (0% to 100% in F).
  • Figure 5: Adaptive path selection. The robot starts on the ground and is given a target on top of the box in the back, and then commanded back to the initial position. (A) Likelihood of going up and down along the direct path (red line) as a function of the height of the box. (B) and (C) Deployment on the robot for h = 0.75 m. (D) and (E) Deployment on the robot for h = 1.15 m. For the same targets and box placement, the navigation policy chooses a different path depending on the height of the boxes to reach the goal.
  • ...and 6 more figures