Stein Variational Guided Model Predictive Path Integral Control: Proposal and Experiments with Fast Maneuvering Vehicles

Kohei Honda; Naoki Akai; Kosuke Suzuki; Mizuho Aoki; Hirotaka Hosogaya; Hiroyuki Okuda; Tatsuya Suzuki

TL;DR

Stein Variational Guided MPPI (SVG-MPPI) is a Stochastic Optimal Control (SOC) method based on Model Predictive Path Integral control (MPPI), designed to handle rapidly shifting multimodal optimal action distributions. In the reported simulation and real-world experiments, it outperforms both the original MPPI and other state-of-the-art sampling-based SOC algorithms in path-tracking and obstacle-avoidance capabilities.

Abstract

This paper presents a novel Stochastic Optimal Control (SOC) method based on Model Predictive Path Integral control (MPPI), named Stein Variational Guided MPPI (SVG-MPPI), designed to handle rapidly shifting multimodal optimal action distributions. While MPPI can find a Gaussian-approximated optimal action distribution in closed form, i.e., without iterative solution updates, it struggles with multimodal optimal distributions because a single Gaussian cannot represent several modes. To overcome this limitation, our method identifies a target mode of the optimal distribution and guides the solution to converge to it. In the proposed method, the target mode is roughly estimated using a modified Stein Variational Gradient Descent (SVGD) method and embedded into the MPPI algorithm to find a closed-form "mode-seeking" solution that covers only the target mode, thus preserving the fast convergence property of MPPI. Our simulation and real-world experimental results demonstrate that SVG-MPPI outperforms both the original MPPI and other state-of-the-art sampling-based SOC algorithms in terms of path-tracking and obstacle-avoidance capabilities. Source code: https://github.com/kohonda/proj-svg_mppi
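The closed-form update mentioned in the abstract, and the difficulty it has with multimodal costs, can be illustrated with a minimal single-step sketch. This is not the authors' implementation: the 1-D cost function, the prior parameters, the temperature `lam`, and the helper name `mppi_update` are all hypothetical choices made for illustration, and the control-cost term of the full MPPI weights is omitted.

```python
import math
import random


def cost(u):
    # Hypothetical 1-D cost: two equally good actions at u = -1 and u = +1,
    # separated by a high-cost region around u = 0 (e.g. an obstacle).
    return 10.0 * min((u - 1.0) ** 2, (u + 1.0) ** 2)


def mppi_update(samples, costs, lam):
    # Closed-form, MPPI-style update: weights proportional to exp(-cost / lam),
    # followed by a weighted average of the sampled actions (no iteration).
    c_min = min(costs)  # subtract the minimum cost for numerical stability
    weights = [math.exp(-(c - c_min) / lam) for c in costs]
    total = sum(weights)
    return sum(w * u for w, u in zip(weights, samples)) / total


rng = random.Random(0)  # fixed seed so the run is repeatable
prior_mean, prior_std = 0.0, 1.0  # assumed Gaussian sampling distribution
samples = [rng.gauss(prior_mean, prior_std) for _ in range(2000)]
costs = [cost(u) for u in samples]

u_mean = mppi_update(samples, costs, lam=1.0)
u_best = min(samples, key=cost)

print(f"weighted-mean action: {u_mean:+.3f}, cost {cost(u_mean):.2f}")
print(f"best single sample:   {u_best:+.3f}, cost {cost(u_best):.2f}")
```

Because the two low-cost modes are symmetric, the weighted mean falls near u = 0, inside the high-cost region, even though individual samples close to either mode have almost zero cost. In this toy setting, that is the kind of failure a mode-seeking solution is meant to avoid.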

Paper Structure (28 sections, 12 equations, 6 figures, 2 tables)

INTRODUCTION
RELATED WORK
Increasing Effective Samples
Approximating Multimodal Distribution
Finding a Mode-Seeking Solution
REVIEW OF MPPI
Review of the MPPI Theory
Problem Formulation
Analytical PDF of the Optimal Action Distribution
Forward KL Divergence Minimization
An Open Issue of MPPI
Asymmetry Property of KL divergence
STEIN VARIATIONAL GUIDED MPPI
Transport Guide Particles by the Modified SVGD method
FKL divergence Minimization with the Nominal Sequence and Adaptive Covariance Matrix Sequence
...and 13 more sections

Figures (6)

Figure 1: An open issue of MPPI. The MPPI algorithm minimizes the Forward KL divergence instead of solving the original stochastic optimal control problem. As a result, the estimated action distribution $\mathbb{Q}$ may cover multiple modes of the optimal action distribution $\mathbb{Q}^*$, so the optimized trajectory may be one that collides.
Figure 2: Path tracking scenario
Figure 3: Obstacle avoidance scenario
Figure 5: Asymmetry properties of the KL divergence [kobayashi2022real]. Minimizing the Forward KL (FKL) divergence yields a distribution that covers multiple modes of the optimal action distribution, because the FKL divergence imposes a significant penalty when $q^* > 0$ and $q \approx 0$. In contrast, minimizing the Reverse KL (RKL) divergence converges to a single mode of the optimal distribution, a property known as mode-seeking, because the RKL divergence takes a larger value when $q > 0$ and $q^* \approx 0$.
Figure 6: Overview of the proposed method. (a) SVG-MPPI first transports guide particles by minimizing the RKL divergence using the modified SVGD method. These transport trajectories of the guide particles are employed to roughly estimate a convergent target mode. (b) We then identify the peak of the target mode by simply taking the lowest sequence cost guide particle as a nominal sequence. (c) The variances are also roughly estimated by the Gaussian fitting algorithm with the transport trajectory. (d) Finally, SVG-MPPI minimizes the FKL divergence, incorporating the nominal sequence and adaptive covariance matrices in the process. The SVG-MPPI can efficiently find a mode-seeking action distribution compared to the original MPPI (Fig. 1).
...and 1 more figure
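The mode-covering and mode-seeking behaviour described in the Figure 5 caption can be checked numerically with a small sketch. It is an illustration only, not part of the paper's code: the bimodal target density, the integration grid, and the candidate grid of Gaussian parameters are hypothetical choices. A single Gaussian $q$ is fitted to a two-mode target $q^*$ by brute-force search, once under the FKL divergence $\mathrm{KL}(q^* \| q)$ and once under the RKL divergence $\mathrm{KL}(q \| q^*)$.

```python
import math

LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def log_gauss(x, mu, sigma):
    # Log-density of a 1-D Gaussian N(mu, sigma^2).
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - LOG_SQRT_2PI


def log_target(x):
    # Hypothetical bimodal target q*: equal mixture of N(-2, 0.5^2) and N(2, 0.5^2),
    # evaluated with a log-sum-exp for numerical stability.
    a = math.log(0.5) + log_gauss(x, -2.0, 0.5)
    b = math.log(0.5) + log_gauss(x, 2.0, 0.5)
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


DX = 0.05
XS = [-8.0 + DX * i for i in range(321)]  # integration grid on [-8, 8]
LOG_QSTAR = [log_target(x) for x in XS]


def kl_pair(mu, sigma):
    # Rectangle-rule estimates of FKL = KL(q* || q) and RKL = KL(q || q*).
    fkl = rkl = 0.0
    for x, lqs in zip(XS, LOG_QSTAR):
        lq = log_gauss(x, mu, sigma)
        fkl += math.exp(lqs) * (lqs - lq) * DX
        rkl += math.exp(lq) * (lq - lqs) * DX
    return fkl, rkl


# Brute-force search over candidate means and standard deviations.
candidates = [(-3.0 + 0.25 * i, 0.3 + 0.1 * j) for i in range(25) for j in range(28)]
scores = {c: kl_pair(*c) for c in candidates}
mu_fkl, sigma_fkl = min(candidates, key=lambda c: scores[c][0])
mu_rkl, sigma_rkl = min(candidates, key=lambda c: scores[c][1])

print(f"FKL fit: mu={mu_fkl:+.2f}, sigma={sigma_fkl:.2f}")
print(f"RKL fit: mu={mu_rkl:+.2f}, sigma={sigma_rkl:.2f}")
```

Under these assumptions the FKL fit is a wide Gaussian centred between the two modes, while the RKL fit is a narrow Gaussian sitting on one of them. Which of the two symmetric modes the RKL fit selects depends only on tie-breaking in the grid search.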
