Towards Efficient MPPI Trajectory Generation with Unscented Guidance: U-MPPI Control Strategy

Ihab S. Mohamed, Junhong Xu, Gaurav S. Sukhatme, Lantao Liu

TL;DR

This work addresses safety gaps in the Model Predictive Path Integral (MPPI) controller by introducing U-MPPI, which uses the Unscented Transform to propagate both the mean and covariance of the state and couples this with a risk-sensitive cost that explicitly accounts for uncertainty. The method provides a richer, state-dependent sampling strategy and a tunable risk attitude via the parameter $\gamma$, enabling safer, more robust trajectory optimization in uncertain environments. In extensive simulations of 2D aggressive navigation in known and unknown cluttered environments, and in real-world demonstrations with a Jackal robot in an unknown corridor, U-MPPI achieves higher task completion and success rates, is trapped in local minima less often, and records zero collisions in challenging scenarios, while maintaining real-time performance on GPUs. These results suggest that U-MPPI offers a practical and scalable approach to reliable autonomous navigation under uncertainty, with potential extensions to higher-dimensional systems and to moving obstacles via chance constraints.

Abstract

The classical Model Predictive Path Integral (MPPI) control framework, while effective in many applications, lacks reliable safety features due to its reliance on a risk-neutral trajectory evaluation technique, which can present challenges for safety-critical applications such as autonomous driving. Furthermore, when the majority of MPPI sampled trajectories concentrate in high-cost regions, it may generate an infeasible control sequence. To address this challenge, we propose the U-MPPI control strategy, a novel methodology that can effectively manage system uncertainties while integrating a more efficient trajectory sampling strategy. The core concept is to leverage the Unscented Transform (UT) to propagate not only the mean but also the covariance of the system dynamics, going beyond the traditional MPPI method. As a result, it introduces a novel and more efficient trajectory sampling strategy, significantly enhancing state-space exploration and ultimately reducing the risk of being trapped in local minima. Furthermore, by leveraging the uncertainty information provided by UT, we incorporate a risk-sensitive cost function that explicitly accounts for risk or uncertainty throughout the trajectory evaluation process, resulting in a more resilient control system capable of handling uncertain conditions. By conducting extensive simulations of 2D aggressive autonomous navigation in both known and unknown cluttered environments, we verify the efficiency and robustness of our proposed U-MPPI control strategy compared to the baseline MPPI. We further validate the practicality of U-MPPI through real-world demonstrations in unknown cluttered environments, showcasing its superior ability to incorporate both the UT and local costmap into the optimization problem without introducing additional complexity.
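The core idea of propagating both the mean and the covariance through the dynamics can be illustrated with a minimal sketch of a generic unscented transform step. This is not the authors' implementation: the unicycle model, the time step `dt`, the initial covariance, and the UT parameters `alpha`, `kappa`, `beta` are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def sigma_points(mean, cov, alpha=1.0, kappa=0.0, beta=2.0):
    """Return the 2n+1 sigma points of N(mean, cov) with mean/covariance weights.

    alpha, kappa, beta are the usual scaled-UT parameters (values assumed here).
    """
    n = mean.size
    lam = alpha**2 * (n + kappa) - n
    # Columns of the Cholesky factor give the spread directions.
    L = np.linalg.cholesky((n + lam) * cov)
    pts = np.vstack([mean, mean + L.T, mean - L.T])  # shape (2n+1, n)
    wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1.0 - alpha**2 + beta)
    return pts, wm, wc

def unicycle_step(x, u, dt=0.1):
    """One Euler step of an assumed unicycle model with state [x, y, theta]."""
    px, py, th = x
    v, w = u
    return np.array([px + v * np.cos(th) * dt,
                     py + v * np.sin(th) * dt,
                     th + w * dt])

def ut_propagate(mean, cov, u, dt=0.1):
    """Push every sigma point through the dynamics, then re-estimate mean and covariance."""
    pts, wm, wc = sigma_points(mean, cov)
    new_pts = np.array([unicycle_step(p, u, dt) for p in pts])
    # A plain weighted average of theta is adequate only for small heading spreads.
    new_mean = wm @ new_pts
    d = new_pts - new_mean
    new_cov = (wc[:, None] * d).T @ d
    return new_mean, new_cov, new_pts

# Hypothetical initial belief; the control [v, omega] = [1, 0] matches Figure 4.
mean0 = np.zeros(3)
cov0 = np.diag([0.01, 0.01, 0.01])
mean1, cov1, pts1 = ut_propagate(mean0, cov0, np.array([1.0, 0.0]))
```

For a three-dimensional state such as $[x, y, \theta]^{\top}$, each call produces $2 \cdot 3 + 1 = 7$ sigma-point successors, so a budget of 210 trajectories would correspond to 30 batches of 7.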

Paper Structure (36 sections, 38 equations, 11 figures, 4 tables, 1 algorithm)

Introduction and Related Work
Stochastic Optimal Control
Problem Formulation
Overview of MPPI Control Strategy
Unscented Optimal Control
Unscented Transform
Unscented Optimal Control
U-MPPI Control Strategy
Unscented-Based Sampling Strategy
Risk-Sensitive Cost
Real-Time U-MPPI Control Algorithm
Constraint Handling and Scalability in U-MPPI
Constraint Handling for MPPI Variants
Scalability and Adaptability of U-MPPI
Simulation-Based Evaluation
...and 21 more sections

Figures (11)

Figure 1: Our proposed sampling strategy, for a ground vehicle model, under the U-MPPI control strategy based on unscented transform; such a sampling strategy propagates both the mean $\bar{\mathbf{x}}_k$ (blue dots) and covariance $\mathbf{\Sigma}_k$ (gray ellipses) of the state vector at each time-step $k$; to generate $M$ sampled trajectories, we propagate $M_\sigma$ sets of batches, where each batch contains $n_\sigma$ trajectories corresponding to the $n_\sigma$ sigma points, where $M=n_\sigma M_\sigma$, $n_\sigma = 2n_x +1$, and red lines refer to $2n_x$ sigma-point trajectories surrounding the nominal trajectories (blue lines); for our validation-used robot, $n_x=3$.
Figure 2: Schematic illustration of system dynamics propagation in MPPI for $M$ sampled trajectories over a finite time-horizon $N$.
Figure 3: Schematic illustration of nonlinear dynamical system propagation under the proposed U-MPPI control strategy for $M$ sampled trajectories over a finite time-horizon $N$, where $M=n_{\sigma} M_\sigma$.
Figure 4: Distribution of $210$ sampled trajectories generated by (a) MPPI with $\delta \mathbf{u}_{k} \sim \mathcal{N}(\mathbf{0}, 0.025\mathbf{I}_2)$ and (b) U-MPPI with the same perturbation in the control input $\delta \mathbf{u}_{k}$ but with different UT parameters; in both methods, the robot is assumed to be initially located at $\mathbf{x} = [{x}, {y}, \theta]^{\top}= [0,0,0]^{\top}$ in ([m], [m], []), with a commanded control input $\mathbf{u} = [v,\omega]^{\top} = [1, 0]^{\top}$ in ([m/s], [rad/s]).
Figure 5: Influence of different $\gamma$ values on the penalty coefficients $Q_{\mathrm{rs}}$ for state reference tracking with $Q = 2$ and varying 1-dimensional state covariance $\mathbf{\Sigma}$.
...and 6 more figures
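The dependence of the penalty coefficient $Q_{\mathrm{rs}}$ on $\gamma$ and $\mathbf{\Sigma}$ described in the Figure 5 caption can be explored numerically. The expression for $Q_{\mathrm{rs}}$ is not reproduced on this page, so the sketch below assumes the form $Q_{\mathrm{rs}} = (Q^{-1} + \gamma \mathbf{\Sigma})^{-1}$, which arises in exponential-utility risk-sensitive control; both the form and the sign convention for $\gamma$ are assumptions, and the function name is hypothetical.

```python
import numpy as np

def risk_sensitive_weight(Q, Sigma, gamma):
    """Assumed form: Q_rs = (Q^{-1} + gamma * Sigma)^{-1}.

    Q and Sigma may be scalars or square matrices. The expression is
    singular when Q^{-1} + gamma * Sigma loses rank, which can happen
    for sufficiently negative gamma.
    """
    Q = np.atleast_2d(np.asarray(Q, dtype=float))
    Sigma = np.atleast_2d(np.asarray(Sigma, dtype=float))
    return np.linalg.inv(np.linalg.inv(Q) + gamma * Sigma)

# Setting from the Figure 5 caption: Q = 2 with a 1-dimensional covariance.
Q = 2.0
for gamma in (-0.25, 0.0, 1.0):          # illustrative gamma values (assumed)
    for Sigma in (0.0, 0.5, 1.0):        # illustrative covariance values (assumed)
        q_rs = risk_sensitive_weight(Q, Sigma, gamma)[0, 0]
        print(f"gamma={gamma:+.2f}  Sigma={Sigma:.1f}  Q_rs={q_rs:.4f}")
```

Under this assumed form, $\gamma = 0$ recovers $Q_{\mathrm{rs}} = Q$ regardless of $\mathbf{\Sigma}$, a positive $\gamma$ shrinks the penalty as the covariance grows, and a negative $\gamma$ inflates it; with the opposite sign convention the two behaviours swap.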
