Model Predictive Path Integral Control as Preconditioned Gradient Descent

Mahyar Fazlyab, Sina Sharifi, Jiarui Wang

Abstract

Model Predictive Path Integral (MPPI) control is a popular sampling-based method for trajectory optimization in nonlinear and nonconvex settings, yet its optimization structure remains only partially understood. We develop a variational, optimization-theoretic interpretation of MPPI by lifting constrained trajectory optimization to a KL-regularized problem over distributions and reducing it to a negative log-partition (free-energy) objective over a tractable sampling family. For a general parametric family, this yields a preconditioned gradient method on the distribution parameters and a natural multi-step extension of MPPI. For the fixed-covariance Gaussian family, we show that classical MPPI is recovered exactly as a preconditioned gradient descent step with unit step size. This interpretation enables a direct convergence analysis: under bounded feasible sets, we derive an explicit upper bound on the smoothness constant and a simple sufficient condition guaranteeing descent of exact MPPI. Numerical experiments support the theory and illustrate the effect of key hyperparameters on performance.
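
For concreteness, the update the abstract describes can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the interface (`mppi_step`, `cost`, `lam`, `sigma`) and the numerically stabilizing min-subtraction are assumptions of this sketch.

```python
import numpy as np

def mppi_step(u_nom, cost, n_samples=256, sigma=0.5, lam=1.0, rng=None):
    """One classical MPPI update of a nominal control sequence u_nom.

    cost maps a control sequence (same shape as u_nom) to a scalar
    trajectory cost; the rollout through the dynamics is assumed to
    happen inside cost. sigma is the fixed sampling standard deviation,
    lam the temperature of the exponential weighting.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Sample Gaussian perturbations around the current nominal sequence.
    eps = rng.normal(scale=sigma, size=(n_samples, *u_nom.shape))
    costs = np.array([cost(u_nom + e) for e in eps])
    # Exponential (softmax) weights; subtracting the min avoids underflow.
    w = np.exp(-(costs - costs.min()) / lam)
    w /= w.sum()
    # Weighted mean of the perturbations: the classical MPPI mean update,
    # which the paper interprets as a preconditioned gradient descent
    # step with unit step size on the free-energy objective.
    return u_nom + np.einsum('k,k...->...', w, eps)
```

Iterating `mppi_step` several times before executing the first control gives the multi-step extension mentioned in the abstract (and compared in Figure 2 below).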

Paper Structure

This paper contains 19 sections, 4 theorems, 52 equations, 2 figures, 1 table, 1 algorithm.

Key Result

Lemma 1 (Gradient and Hessian Representations)

Under Assumption ass:regularity_parametric, the gradient and Hessian of the free-energy objective admit exact expectation representations with respect to the sampling distribution.
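
The full statement is not reproduced here; as background, the standard score-function representations that a lemma with this title would typically establish read as follows, in notation assumed for this sketch (sampling family $q_\theta$, trajectory cost $J$, temperature $\lambda$) rather than necessarily the paper's:

```latex
% Free-energy (negative log-partition) objective and tilted distribution:
%   F(\theta) = -\lambda \log \mathbb{E}_{U \sim q_\theta}[e^{-J(U)/\lambda}],
%   \tilde q_\theta(u) \propto q_\theta(u)\, e^{-J(u)/\lambda}.
% Differentiating under the integral sign gives
\begin{align*}
  \nabla_\theta F(\theta)
    &= -\lambda\, \mathbb{E}_{\tilde q_\theta}\!\bigl[\nabla_\theta \log q_\theta(U)\bigr],\\
  \nabla_\theta^2 F(\theta)
    &= -\lambda \Bigl(
        \mathbb{E}_{\tilde q_\theta}\!\bigl[\nabla_\theta^2 \log q_\theta(U)
          + \nabla_\theta \log q_\theta(U)\,\nabla_\theta \log q_\theta(U)^\top\bigr]
        - \mathbb{E}_{\tilde q_\theta}\!\bigl[\nabla_\theta \log q_\theta(U)\bigr]\,
          \mathbb{E}_{\tilde q_\theta}\!\bigl[\nabla_\theta \log q_\theta(U)\bigr]^\top
      \Bigr).
\end{align*}
```

For a Gaussian family with fixed covariance $\Sigma$ and mean $\mu$, $\nabla_\mu \log q_\mu(U) = \Sigma^{-1}(U - \mu)$, so a gradient step preconditioned by $\Sigma/\lambda$ with unit step size moves $\mu$ exactly to $\mathbb{E}_{\tilde q_\mu}[U]$, the exponentially weighted trajectory average of classical MPPI; this is the recovery the abstract describes.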

Figures (2)

  • Figure 1: Ablation study of the parameters $\eta$ (top left), $\Sigma$ (top right), and $\tau$ (bottom left), and comparison with gradient descent and finite differences (bottom right) on the LQR benchmark.
  • Figure 2: Comparison of the trajectories chosen by MPPI and 10-step MPPI on the Dubins car benchmark in a cluttered environment.

Theorems & Definitions (4)

  • Lemma 1: Gradient and Hessian Representations
  • Lemma 2
  • Theorem 1
  • Theorem 2
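
Theorems 1 and 2 are not reproduced here; the descent guarantee summarized in the abstract presumably rests on the standard descent lemma for an $L$-smooth objective, sketched below in generic notation (the paper's explicit bound on $L$ and its preconditioned metric may differ):

```latex
% Descent lemma: if F is L-smooth, a gradient step with step size \gamma
% satisfies
\[
  \theta^{+} = \theta - \gamma\, \nabla F(\theta)
  \quad \Longrightarrow \quad
  F(\theta^{+}) \le F(\theta)
    - \gamma \Bigl(1 - \tfrac{L\gamma}{2}\Bigr) \|\nabla F(\theta)\|^{2}.
\]
% With the unit step size \gamma = 1 of exact MPPI, the right-hand side
% certifies descent whenever L < 2.
```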