Benchmarking Model Predictive Control and Reinforcement Learning Based Control for Legged Robot Locomotion in MuJoCo Simulation
Shivayogi Akki, Tan Chen
TL;DR
This study benchmarks MPC and RL controllers for legged locomotion on the Unitree Go1 in MuJoCo, focusing on straight walking at $0.5~\mathrm{m/s}$. RL achieves better disturbance rejection and a lower cost of transport (CoT), aided by high-frequency actions and knee-driven propulsion, while MPC recovers more stably from large perturbations through balanced joint utilization. However, RL generalizes poorly to slippery and uneven terrains, indicating a sim-to-real and robustness gap. The results highlight a fundamental trade-off and motivate hybrid or domain-randomized approaches that combine robustness with efficiency for practical legged robotics.
Abstract
Model Predictive Control (MPC) and Reinforcement Learning (RL) are two prominent strategies for controlling legged robots, each with distinct strengths. RL learns control policies through interaction with the system and can adapt to varied scenarios, whereas MPC relies on a predefined mathematical model to solve optimization problems in real time. Despite their widespread use, direct comparisons under standardized conditions are lacking. This work addresses that gap by benchmarking MPC and RL controllers on a Unitree Go1 quadruped robot in the MuJoCo simulation environment, focusing on a standardized task: straight walking at a constant velocity. Performance is evaluated in terms of disturbance rejection, energy efficiency, and terrain adaptability. The results show that RL excels at handling disturbances and maintaining energy efficiency but generalizes poorly to new terrains, because its learned policy is tailored to the environments seen in training. In contrast, MPC recovers better from larger perturbations: its optimization-based approach distributes control effort in a balanced way across the robot's joints. The results clarify the advantages and limitations of both RL and MPC and offer guidance for selecting a control strategy for legged robotic applications.
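To make the energy-efficiency metric concrete, the following is a minimal sketch of how a dimensionless cost of transport could be computed from logged simulation data. It assumes the common definition $\mathrm{CoT} = E / (m g d)$, with $E$ taken as the time integral of the absolute mechanical joint power; the exact definition used in this work may differ. The function name, argument names, and the example numbers are hypothetical.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2


def cost_of_transport(torques, joint_velocities, dt, mass, distance):
    """Dimensionless CoT = E / (m * g * d).

    Assumption: energy E is the integral of |tau * qdot| summed over joints,
    i.e. negative mechanical work is counted as a cost rather than recovered.
    torques, joint_velocities: arrays of shape (timesteps, joints).
    """
    torques = np.asarray(torques, dtype=float)
    joint_velocities = np.asarray(joint_velocities, dtype=float)
    # Absolute mechanical power per timestep, summed over all joints (W).
    power = np.sum(np.abs(torques * joint_velocities), axis=1)
    # Rectangle-rule integration over the logged trajectory (J).
    energy = np.sum(power) * dt
    return energy / (mass * G * distance)


# Hypothetical log: 100 steps, 12 joints, constant 1 N*m at 2 rad/s.
tau = np.ones((100, 12))
qdot = 2.0 * np.ones((100, 12))
cot = cost_of_transport(tau, qdot, dt=0.01, mass=12.0, distance=0.5)
```

With this definition, a lower value means less energy spent per unit weight per unit distance, which allows controllers to be compared at the same commanded velocity.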
