Model Predictive Control via Probabilistic Inference: A Tutorial
Kohei Honda
TL;DR
This paper surveys probabilistic inference-based MPC as a unified, sampling-based approach to finite-horizon control in robotics. It derives the optimal control distribution $\pi^*(\mathbf{u}_{0:T-1}) = Z^{-1}\exp(-\lambda^{-1} J_{\tau}(\mathbf{u}_{0:T-1}))\, p(\mathbf{u}_{0:T-1})$, where $Z$ is the normalizing constant, interprets it as a Boltzmann-type weighting of the cost $J_{\tau}$ multiplied by a prior $p$ over control sequences, and analyzes how the temperature $\lambda$ and the choice of prior shape behavior and sample efficiency. The MPPI algorithm is presented as a practical instantiation: it approximates $\pi^*$ with a Gaussian variational family and computes the mean control sequence as a softmax-weighted average over Monte Carlo samples. Its sampling and rollout steps are embarrassingly parallel and compatible with modern differentiable tooling. The article also discusses tuning strategies, exploration and diversity considerations, theoretical insights, and links to diffusion models, offering an implementation-focused guide for researchers and practitioners. Overall, it presents probabilistic inference-based MPC as a scalable framework that handles arbitrary dynamics and costs while balancing optimality, exploration, and real-time feasibility.
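As a worked form of the softmax-weighted average, the mean of $\pi^*$ can be estimated by self-normalized importance sampling. The expression below is a sketch under one simplifying assumption made here for illustration: the $K$ samples $\mathbf{u}^{(k)}_{0:T-1}$ are drawn directly from the prior $p$. Practical MPPI variants typically sample from a separate proposal and carry an additional correction term, which is omitted here.

```latex
\mathbb{E}_{\pi^*}\!\left[\mathbf{u}_{0:T-1}\right]
= \frac{\mathbb{E}_{p}\!\left[\mathbf{u}_{0:T-1}\,\exp\!\left(-\lambda^{-1} J_{\tau}(\mathbf{u}_{0:T-1})\right)\right]}
       {\mathbb{E}_{p}\!\left[\exp\!\left(-\lambda^{-1} J_{\tau}(\mathbf{u}_{0:T-1})\right)\right]}
\approx \sum_{k=1}^{K} w_k\, \mathbf{u}^{(k)}_{0:T-1},
\qquad
w_k = \frac{\exp\!\left(-\lambda^{-1} J_{\tau}(\mathbf{u}^{(k)}_{0:T-1})\right)}
           {\sum_{j=1}^{K} \exp\!\left(-\lambda^{-1} J_{\tau}(\mathbf{u}^{(j)}_{0:T-1})\right)},
\qquad
\mathbf{u}^{(k)}_{0:T-1} \sim p .
```

The denominator of the first fraction is the normalizing constant $Z$. As $\lambda \to 0$ the weights concentrate on the lowest-cost sample, and as $\lambda \to \infty$ they become uniform, so the estimate reverts to the sample mean of the prior.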
Abstract
Model Predictive Control (MPC) is a fundamental framework for optimizing robot behavior over a finite future horizon. While conventional numerical optimization methods can efficiently handle simple dynamics and cost structures, they often become intractable for the nonlinear or non-differentiable systems commonly encountered in robotics. This article provides a tutorial on probabilistic inference-based MPC, presenting a unified theoretical foundation and a comprehensive overview of representative methods. Probabilistic inference-based MPC approaches, such as Model Predictive Path Integral (MPPI) control, have gained significant attention by reinterpreting optimal control as a problem of probabilistic inference. Rather than relying on gradient-based numerical optimization, these methods estimate optimal control distributions through sampling-based techniques, accommodating arbitrary cost functions and dynamics. We first derive the optimal control distribution from the standard optimal control problem, elucidating its probabilistic interpretation and key characteristics. The widely used MPPI algorithm is then derived as a practical example, followed by discussions on prior and variational distribution design, tuning principles, and theoretical aspects. This article aims to serve as a systematic guide for researchers and practitioners seeking to understand, implement, and extend these methods in robotics and beyond.
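To make the sampling-based estimate concrete, the following is a minimal NumPy sketch of a single MPPI-style update. It is an illustration under stated assumptions, not the article's reference implementation: the toy 1-D double-integrator dynamics, the quadratic cost, and the names `rollout_cost` and `mppi_step` are all hypothetical, and the Gaussian centered on the current mean is treated as the prior, so the prior-correction term found in full MPPI formulations is omitted.

```python
import numpy as np


def rollout_cost(x0, controls, dt=0.1, goal=1.0):
    """Cost of one control sequence on a toy 1-D double integrator.

    Hypothetical example system: state = (position, velocity), control = acceleration.
    """
    pos, vel = x0
    cost = 0.0
    for u in controls:
        vel = vel + dt * u            # integrate acceleration
        pos = pos + dt * vel          # integrate velocity
        cost += (pos - goal) ** 2 + 1e-3 * u ** 2
    return cost


def mppi_step(x0, mean, sigma=1.0, lam=1.0, num_samples=256, rng=None):
    """One simplified MPPI-style update of the mean control sequence.

    Assumption: samples come from N(mean, sigma^2 I), which is treated as the
    prior, so the weights reduce to a softmax over -cost / lam.
    """
    rng = np.random.default_rng() if rng is None else rng
    horizon = mean.shape[0]

    # 1) Sample K control sequences around the current mean.
    samples = mean + sigma * rng.standard_normal((num_samples, horizon))

    # 2) Evaluate each sample by rolling out the model (independent per sample,
    #    hence trivially parallelizable).
    costs = np.array([rollout_cost(x0, u) for u in samples])

    # 3) Softmax weights with temperature lam; subtracting the minimum cost
    #    leaves the weights unchanged and avoids underflow.
    weights = np.exp(-(costs - costs.min()) / lam)
    weights /= weights.sum()

    # 4) The new mean is the weighted average of the sampled sequences.
    new_mean = weights @ samples
    return new_mean, weights


if __name__ == "__main__":
    x0 = (0.0, 0.0)
    mean = np.zeros(20)
    rng = np.random.default_rng(0)
    for _ in range(10):
        mean, _ = mppi_step(x0, mean, rng=rng)
    print("cost with zero controls:", rollout_cost(x0, np.zeros(20)))
    print("cost after 10 updates:  ", rollout_cost(x0, mean))
```

In a receding-horizon loop one would typically apply the first element of the updated mean, shift the sequence by one step, and repeat. Only cost evaluations are required, so the dynamics and cost in `rollout_cost` can be replaced by non-differentiable ones without changing `mppi_step`.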
