MPC4RL -- A Software Package for Reinforcement Learning based on Model Predictive Control
Dirk Reinhardt, Katrin Baumgärtner, Jonathan Frey, Moritz Diehl, Sebastien Gros
TL;DR
The paper addresses the lack of open-source tools for reinforcement-learning-based MPC and introduces MPC4RL, an open-source Python package that links acados with Gymnasium and stable-baselines3 to support learning-enhanced MPC. It extends acados to provide parametric NLP sensitivities, enabling efficient computation of the gradients $\nabla_\theta V_\theta(s)$ and $\nabla_\theta Q_\theta(s,a)$ needed by RL methods that use MPC as a function approximator. In two case studies, the authors show that policy-gradient evaluations via these sensitivities are about an order of magnitude faster than general-purpose approaches. The package is modular and extensible and is released on GitHub, with plans for parallel sensitivity evaluation, warm starting, and broader RL algorithm support.
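To make the role of the sensitivity $\nabla_\theta V_\theta(s)$ concrete, the following is a minimal sketch on a toy problem. It does not use the MPC4RL or acados API. It assumes a scalar linear system $x_{k+1} = a x_k + b u_k$ with cost $\sum_{k=0}^{N-1} (x_k^2 + \theta u_k^2) + x_N^2$, where the input weight $\theta$ is the learnable parameter. All function names and numerical values are hypothetical and chosen for illustration only. With the states eliminated, the envelope theorem gives $\nabla_\theta V_\theta(s) = \lVert u^\star \rVert^2$, which the sketch compares against a finite-difference estimate.

```python
import numpy as np


def rollout_matrices(a, b, horizon):
    """Return (A, B) with [x_1, ..., x_N] = A * s + B @ u for x_{k+1} = a*x_k + b*u_k."""
    A = np.array([a ** (k + 1) for k in range(horizon)])
    B = np.zeros((horizon, horizon))
    for k in range(horizon):
        for j in range(k + 1):
            B[k, j] = a ** (k - j) * b
    return A, B


def solve_mpc(s, theta, a=0.9, b=0.5, horizon=5):
    """Solve the toy MPC problem in closed form; return (V_theta(s), optimal inputs)."""
    A, B = rollout_matrices(a, b, horizon)
    # Stationarity of the condensed cost: (B^T B + theta I) u = -B^T A s
    H = B.T @ B + theta * np.eye(horizon)
    u = -np.linalg.solve(H, B.T @ (A * s))
    x = A * s + B @ u
    value = s**2 + x @ x + theta * (u @ u)
    return value, u


def value_gradient(s, theta, **kwargs):
    """Envelope theorem: dV/dtheta is the partial derivative of the cost at the optimum."""
    _, u = solve_mpc(s, theta, **kwargs)
    return u @ u


def finite_difference_gradient(s, theta, eps=1e-6, **kwargs):
    """Central finite difference, used here only as a reference check."""
    v_plus, _ = solve_mpc(s, theta + eps, **kwargs)
    v_minus, _ = solve_mpc(s, theta - eps, **kwargs)
    return (v_plus - v_minus) / (2 * eps)


if __name__ == "__main__":
    s0, theta0 = 1.5, 0.3
    print("sensitivity-based gradient:", value_gradient(s0, theta0))
    print("finite-difference gradient:", finite_difference_gradient(s0, theta0))
```

The sensitivity-based gradient needs a single solve and reuses the optimal solution, whereas finite differences need two extra solves per parameter. For constrained nonlinear problems the same idea applies to the Lagrangian evaluated at the primal-dual solution.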
Abstract
In this paper, we present an early software package that integrates Reinforcement Learning (RL) with Model Predictive Control (MPC). Our aim is to make recent theoretical contributions from the literature more accessible to both the RL and MPC communities. We combine standard software tools developed by the RL community, such as Gymnasium, stable-baselines3, or CleanRL, with the acados toolbox, a widely used software package for efficient MPC algorithms. Our core contribution is MPC4RL, an open-source Python package that supports learning-enhanced MPC schemes for existing acados implementations. The package is designed to be modular, extensible, and user-friendly, facilitating the tuning of MPC algorithms for a broad range of control problems. It is available on GitHub.
