Real-Time Whole-Body Control of Legged Robots with Model-Predictive Path Integral Control

Juan Alvarez-Padilla, John Z. Zhang, Sofia Kwok, John M. Dolan, Zachary Manchester

TL;DR

The paper tackles real-time whole-body control for legged robots performing locomotion and manipulation under contact-rich conditions. It adopts Model-Predictive Path Integral Control (MPPI) with cubic-spline control sampling and MuJoCo-based parallel rollouts to generate policies online without offline training. Hardware experiments on a Unitree Go1 demonstrate flat-ground walking, challenging terrain traversal, and box manipulation with emergent contact behaviors, supported by systematic ablations and sim-to-real comparisons. The work shows that gradient-free, online planning over full-body dynamics is feasible in real time, highlighting the practical potential of sampling-based MPC for legged loco-manipulation tasks.

Abstract

This paper presents a system for enabling real-time synthesis of whole-body locomotion and manipulation policies for real-world legged robots. Motivated by recent advancements in robot simulation, we leverage the efficient parallelization capabilities of the MuJoCo simulator to achieve fast sampling over the robot state and action trajectories. Our results show surprisingly effective real-world locomotion and manipulation capabilities with a very simple control strategy. We demonstrate our approach on several hardware and simulation experiments: robust locomotion over flat and uneven terrains, climbing over a box whose height is comparable to the robot, and pushing a box to a goal position. To our knowledge, this is the first successful deployment of whole-body sampling-based MPC on real-world legged robot hardware. Experiment videos and code can be found at: https://whole-body-mppi.github.io/

Paper Structure (23 sections, 4 equations, 7 figures, 1 table, 1 algorithm)

This paper contains 23 sections, 4 equations, 7 figures, 1 table, 1 algorithm.

Figures (7)

  • Figure 1: A Unitree Go1 robot pushing a box to a desired location with MPPI on hardware (top row) and corresponding MuJoCo simulation states (bottom row) on a single sequence. Contact-rich behaviors like body pushes and leg kicks emerge in real-time without manual pre-specification or offline policy training.
  • Figure 2: System diagram for deploying the MPPI policy on a Unitree Go1 robot. Joint target controls ($u$) are sampled at the evenly distributed knot points (black dots) and represented as a cubic spline over the planning horizon. A cost from each sample is evaluated based on the user-specified goal (yellow ball) and cost function. The first control from the control sequence with the lowest total cost (opaque orange line) is applied to the robot and repeated in a receding-horizon fashion. The robot's state is estimated using an EKF from motion-captured position and orientation, robot onboard IMU, and joint encoder measurements.
  • Figure 3: Keyframes from the Unitree Go1 robot climbing up and down a box of its own height with the MPPI policy on hardware (top row) and corresponding MuJoCo states (bottom row). The robot is tasked to reach the consecutive goals (yellow spheres) specified in the task.
  • Figure 4: Go1 robot walking in a clockwise hexagon trajectory under small to moderate model mismatch and external disturbance. More transparent robots represent earlier keyframes.
  • Figure 5: Top-down view of real-world box trajectories (magenta and grey lines) from $10$ trials of the Go1 robot pushing the box (black square) into the goal area (dashed circle), which is placed in front (left) and to the front-right (right) of the original box position. More transparent boxes represent earlier points in the trajectory. Magenta lines represent runs in which the box successfully reaches the target area, while grey lines indicate runs in which it does not.
  • ...and 2 more figures
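
The sampling loop described in the Figure 2 caption (sample controls at knot points, interpolate them with a cubic spline, score each rollout, apply the first control of the lowest-cost sequence) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: a 2D point mass stands in for the parallel MuJoCo rollouts, a Catmull-Rom cubic Hermite spline is an assumed choice of cubic spline, and every function name, cost weight, and sample count is hypothetical.

```python
import numpy as np

def cubic_spline_controls(knots, horizon):
    """Interpolate knot controls (K, nu) into a dense sequence (horizon, nu).

    Uses a Catmull-Rom cubic Hermite spline (an assumed spline type).
    """
    K = knots.shape[0]
    # Finite-difference tangents at each knot, in knot-index units.
    tangents = np.gradient(knots, axis=0)
    # Dense time steps expressed in knot-index units, evenly spaced.
    x = np.linspace(0.0, K - 1, horizon)
    i = np.minimum(x.astype(int), K - 2)   # segment index per time step
    s = (x - i)[:, None]                   # local coordinate in [0, 1]
    # Cubic Hermite basis functions.
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return (h00 * knots[i] + h10 * tangents[i]
            + h01 * knots[i + 1] + h11 * tangents[i + 1])

def rollout_cost(x0, controls, goal, dt=0.05):
    """Roll out a toy point mass (stand-in for a simulator) and sum the cost."""
    pos, vel = x0[:2].copy(), x0[2:].copy()
    cost = 0.0
    for u in controls:
        vel = vel + dt * u
        pos = pos + dt * vel
        # Hypothetical cost: squared distance to goal plus a small effort term.
        cost += np.sum((pos - goal) ** 2) + 1e-3 * np.sum(u ** 2)
    return cost

def plan_step(x0, mean_knots, goal, rng, num_samples=64, horizon=20, sigma=1.0):
    """Sample knot perturbations and return the lowest-cost candidate."""
    noise = rng.normal(0.0, sigma, size=(num_samples,) + mean_knots.shape)
    noise[0] = 0.0  # keep the unperturbed nominal plan as one candidate
    candidates = mean_knots + noise
    costs = np.array([
        rollout_cost(x0, cubic_spline_controls(k, horizon), goal)
        for k in candidates
    ])
    best = candidates[np.argmin(costs)]
    first_control = cubic_spline_controls(best, horizon)[0]
    return best, first_control, costs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    state = np.zeros(4)              # [x, y, vx, vy]
    goal = np.array([1.0, 0.5])
    knots = np.zeros((4, 2))         # 4 knot points, 2 control dimensions
    for _ in range(50):              # receding-horizon loop
        knots, u0, _ = plan_step(state, knots, goal, rng)
        state[2:] += 0.05 * u0       # apply only the first control
        state[:2] += 0.05 * state[2:]
    print("distance to goal:", np.linalg.norm(state[:2] - goal))
```

The sketch reuses the previous best knots as the next sampling mean without shifting them in time, and it evaluates candidates sequentially; a real-time system would evaluate the rollouts in parallel, as the abstract describes.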