Learning Agile Locomotion and Adaptive Behaviors via RL-augmented MPC

Yiyu Chen, Quan Nguyen

TL;DR

This paper combines RL and MPC to unify stance foot control and swing foot trajectory, which traditional locomotion controllers handle separately. The unified approach improves agility and robustness through adaptive behavior, and the same policy generalizes across different robot platforms.

Abstract

In the context of legged robots, adaptive behavior involves adaptive balancing and adaptive swing foot reflection. While adaptive balancing counteracts perturbations to the robot, adaptive swing foot reflection helps the robot navigate intricate terrains without foot entrapment. In this paper, we bring both aspects of adaptive behavior to quadruped locomotion by combining RL and MPC, while improving the robustness and agility of blind legged locomotion. This integration leverages MPC's predictive capabilities and RL's ability to draw on past experience. Unlike traditional locomotion controllers, which separate stance foot control from swing foot trajectory, our approach unifies them and addresses their lack of synchronization. At the heart of our contribution is the synthesis of stance foot control with swing foot reflection, improving agility and robustness in locomotion with adaptive behavior. A hallmark of our approach is robust blind stair climbing through swing foot reflection. Moreover, we intentionally designed the learning module as a general plugin for different robot platforms. We trained the policy and implemented our approach on the Unitree A1 robot, achieving a peak turn rate of 8.5 rad/s, a peak running speed of 3 m/s, and steering at a speed of 2.5 m/s. This framework also allows the robot to maintain stable locomotion while bearing an unexpected load of 10 kg, or 83% of its body mass. We further demonstrate the generalizability and robustness of the same policy, which transfers zero-shot to other robot platforms, such as the Go1 and AlienGo robots, for load carrying. Code is available to the research community at https://github.com/DRCL-USC/RL_augmented_MPC.git

Paper Structure (17 sections, 6 equations, 7 figures, 1 table)

Figures (7)

  • Figure 1: Experiment result highlights. a) High-speed steering in place; b) High-speed running; c) High-speed running and steering; d) Generalization of the same policy across different robot platforms; e) Transition between soft and hard terrain. Experiment video: https://www.youtube.com/watch?v=HxSIxTnEw08
  • Figure 2: System architecture of the proposed framework. The high-level module, framed in blue, includes the adaptive behavior policy and locomotion control module, operating at 33 Hz. The low-level module, running at 1 kHz, includes leg control (using Jacobian and IK), state estimation, and the robot's hardware. The $F/m$ block normalizes the MPC force command into accelerations as a robot-agnostic input to the adaptive behavior policy.
  • Figure 3: Yaw rate of the high-speed turning policy, from IMU data.
  • Figure 4: Linear velocity in the body frame for the high-speed running policy.
  • Figure 5: Linear velocity in the body frame and yaw rate for the high-speed running and steering policy.
  • ...and 2 more figures
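
The $F/m$ normalization described in the Figure 2 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-leg force layout, and the two robot masses below are hypothetical and chosen only for illustration. The sketch shows why dividing the MPC force command by the robot mass gives an input that looks the same for robots of different mass in the same situation (here, standing still with the weight shared equally by four legs).

```python
G = 9.81  # gravitational acceleration in m/s^2

def standing_forces(mass_kg):
    """Hypothetical force command: four legs share the robot's weight equally.

    Returns one [fx, fy, fz] vector (in newtons) per leg.
    """
    fz = mass_kg * G / 4.0
    return [[0.0, 0.0, fz] for _ in range(4)]

def normalize_forces(forces_n, mass_kg):
    """Divide per-leg forces (N) by robot mass (kg) to get accelerations (m/s^2)."""
    if mass_kg <= 0:
        raise ValueError("mass_kg must be positive")
    return [[f / mass_kg for f in leg] for leg in forces_n]

# Illustrative masses only; these are not taken from any robot's specification.
light = normalize_forces(standing_forces(12.0), 12.0)
heavy = normalize_forces(standing_forces(20.0), 20.0)

# The raw forces differ, but the normalized vertical terms are both G / 4.
print(light[0], heavy[0])
```

Because the normalized values no longer depend on mass in this scenario, a policy that consumes them sees the same input on either platform, which is consistent with the caption's description of the input as robot-agnostic.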