Optimality and Suboptimality of MPPI Control in Stochastic and Deterministic Settings
Hannes Homburger, Florian Messerer, Moritz Diehl, Johannes Reuter
TL;DR
The paper analyzes Model Predictive Path Integral (MPPI) control from a stochastic optimal control perspective, showing that standard MPPI solves three related optimal control problems (OCPs), labeled CLS-OCP, OLS-OCP, and DET-OCP, under an input-noise interpretation. It introduces a scaling parameter $\beta$ that tunes the sampling uncertainty and proves that, in smooth unconstrained problems, the suboptimality of the MPPI control scales as $O(\beta^2)$ while the suboptimality of the value function scales as $O(\beta^4)$. The authors derive a Laplace-method-based convergence result for the deterministic case, showing that MPPI trajectories converge to the true DET-OCP solution as $\beta \to 0$. They also discuss computational aspects, including Monte Carlo sampling, importance sampling corrections, and GPU-friendly implementations, and illustrate the theory with numerical experiments. These results provide principled guidance for tuning MPPI hyperparameters in robotics and reinforcement learning applications.
Abstract
Model predictive path integral (MPPI) control has recently received a lot of attention, especially in the robotics and reinforcement learning communities. This letter aims to make the MPPI control framework more accessible to the optimal control community. We present three classes of optimal control problems and their solutions by MPPI. Further, we investigate the suboptimality of MPPI when applied to general deterministic nonlinear discrete-time systems. Here, suboptimality is defined as the deviation between the control provided by MPPI and the optimal solution to the deterministic optimal control problem. We find that, in a smooth and unconstrained setting, the suboptimality of the control input trajectory grows to second order in the scaling of the uncertainty. The results indicate that the suboptimality of the MPPI solution can be modulated by appropriately tuning the hyperparameters. We illustrate our findings using numerical examples.
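The second-order claim can be made concrete with a small numerical sketch. The example below is not taken from the paper: it is a hypothetical scalar, single-step problem, and it assumes that the uncertainty scaling enters as $\Sigma \to \beta^2 \Sigma$ for the sampling covariance and $\lambda \to \beta^2 \lambda$ for the temperature, so that the implied control cost $\frac{\lambda}{2\sigma^2} u^2$ stays fixed. The names `stage_cost`, `deterministic_optimum`, and `mppi_control`, as well as the cost function and all constants, are illustrative assumptions. Under these assumptions the MPPI weighted mean is the mean of a density proportional to $\exp(-J(u)/(\beta^2 \lambda))$, which concentrates at the minimizer of $J$ as $\beta \to 0$.

```python
import numpy as np

LAMBDA = 1.0  # temperature (assumed value)
SIGMA = 1.0   # nominal input-noise standard deviation (assumed value)


def stage_cost(u):
    # Hypothetical smooth, asymmetric cost; not from the paper.
    return np.exp(u) - 2.0 * u


def deterministic_optimum(iters=50):
    # Newton's method on the total deterministic cost
    # J(u) = stage_cost(u) + LAMBDA * u**2 / (2 * SIGMA**2).
    u = 0.0
    for _ in range(iters):
        grad = np.exp(u) - 2.0 + LAMBDA * u / SIGMA**2
        hess = np.exp(u) + LAMBDA / SIGMA**2
        u -= grad / hess
    return u


def mppi_control(beta, n_samples, rng):
    # Sample input perturbations around a zero nominal input with the
    # scaled standard deviation beta * SIGMA (assumed scaling).
    eps = rng.normal(0.0, beta * SIGMA, size=n_samples)
    cost = stage_cost(eps)
    # Exponential weights with scaled temperature beta**2 * LAMBDA;
    # subtracting the minimum cost keeps the exponentials well conditioned.
    w = np.exp(-(cost - cost.min()) / (beta**2 * LAMBDA))
    w /= w.sum()
    # The MPPI control is the weighted average of the sampled inputs.
    return float(np.dot(w, eps))


rng = np.random.default_rng(0)
u_star = deterministic_optimum()
errors = {
    beta: abs(mppi_control(beta, 200_000, rng) - u_star)
    for beta in (1.0, 0.5, 0.25)
}
for beta, err in errors.items():
    print(f"beta = {beta:4.2f}   |u_mppi - u_star| = {err:.4f}")
```

If the quadratic scaling holds for this toy problem, halving $\beta$ should shrink the printed deviation by roughly a factor of four, up to Monte Carlo noise. For small $\beta$ the sampling distribution narrows around the nominal input, so fewer samples carry significant weight and the sample count may need to grow to keep the estimate reliable.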
