Biased-MPPI: Informing Sampling-Based Model Predictive Control by Fusing Ancillary Controllers

Elia Trevisan; Javier Alonso-Mora

TL;DR

This paper tackles sampling-based model predictive control under uncertainty by enabling arbitrary sampling distributions through a bias-aware formulation. It introduces Biased-MPPI, which uses a tilde-cost $\tilde{S}(V) = S(V) + \lambda \log\left( \dfrac{p(V)}{q_s(V)} \right)$ and an optimal, bias-aware distribution, guiding samples via multiple ancillary controllers and online autotuning of $\lambda$. The approach is implemented with both classical and learning-based controllers, forming a control-fusion-style sampling strategy, and validated across simulated pendulum swing-up, intersection crossing, and real-world robot experiments. Results show enhanced robustness, reduced sample requirements, and safer behavior in multi-agent and dynamic scenarios, albeit with potential bias-induced slower trajectories in some cases.
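To see what the modified cost buys, the worked step below assumes the optimal distribution keeps the usual MPPI form, $q^*(V) \propto \exp(-\tilde{S}(V)/\lambda)\,p(V)$, with normalizer $\eta$. Both that form and the symbol $\eta$ are assumptions made here for illustration, not quoted from the paper. Under that assumption, the log-ratio term cancels the base distribution $p(V)$:

```latex
\begin{align}
q^*(V) &= \frac{1}{\eta}\exp\!\left(-\frac{1}{\lambda}\tilde{S}(V)\right)p(V) \\
       &= \frac{1}{\eta}\exp\!\left(-\frac{1}{\lambda}S(V)\right)\frac{q_s(V)}{p(V)}\,p(V) \\
       &= \frac{1}{\eta}\exp\!\left(-\frac{1}{\lambda}S(V)\right)q_s(V), \\
w(V) &= \frac{q^*(V)}{q_s(V)} = \frac{1}{\eta}\exp\!\left(-\frac{1}{\lambda}S(V)\right).
\end{align}
```

The importance weight $w(V)$ of a sample drawn from $q_s$ then depends only on its cost $S(V)$, which would explain why samples from any source, including ancillary controllers, can be weighted in the same way.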

Abstract

Motion planning for autonomous robots in dynamic environments poses numerous challenges due to uncertainties in the robot's dynamics and interaction with other agents. Sampling-based MPC approaches, such as Model Predictive Path Integral (MPPI) control, have shown promise in addressing these complex motion planning problems. However, the performance of MPPI relies heavily on the choice of sampling distribution. Existing literature often uses the previously computed input sequence as the mean of a Gaussian distribution for sampling, leading to potential failures and local minima. In this paper, we propose a novel derivation of MPPI that allows for arbitrary sampling distributions to enhance efficiency, robustness, and convergence while alleviating the problem of local minima. We present an efficient importance sampling scheme that combines classical and learning-based ancillary controllers simultaneously, resulting in more informative sampling and control fusion. Several simulated and real-world experiments demonstrate the validity of our approach.
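The sampling idea described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `biased_mppi_step`, its parameters, and the scalar-input-per-timestep layout are all hypothetical. It assumes that, with the modified cost from the TL;DR, the importance-sampling correction cancels so each sample is weighted by $\exp(-S/\lambda)$ alone, and it omits the online autotuning of $\lambda$.

```python
import numpy as np

def biased_mppi_step(prev_plan, ancillary_plans, rollout_cost, lam=1.0,
                     n_samples=64, sigma=0.5, rng=None):
    """One illustrative update: sample, weight by exp(-S/lambda), average.

    prev_plan:       array of shape (T,), the previously planned input sequence
    ancillary_plans: list of arrays of shape (T,), proposals from other controllers
    rollout_cost:    callable mapping an input sequence to a scalar cost S(V)
    """
    rng = np.random.default_rng() if rng is None else rng
    # Gaussian perturbations around the previous plan (the usual MPPI samples).
    noise = rng.normal(0.0, sigma, size=(n_samples, prev_plan.shape[0]))
    samples = prev_plan[None, :] + noise
    # Input sequences proposed by ancillary controllers join the same sample set.
    if len(ancillary_plans) > 0:
        samples = np.vstack([samples, np.asarray(ancillary_plans, dtype=float)])
    costs = np.array([rollout_cost(v) for v in samples])
    # Assumed weighting: exp(-S/lambda); subtracting the minimum cost only
    # improves numerical stability and does not change the normalized weights.
    weights = np.exp(-(costs - costs.min()) / lam)
    weights /= weights.sum()
    # The new plan is the weighted average of all sampled input sequences.
    return weights @ samples, weights
```

When every perturbation of the previous plan is costly (for example, all of them collide) but one ancillary proposal is cheap, that proposal receives almost all of the weight and the new plan moves toward it in a single update.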

Paper Structure (31 sections, 24 equations, 8 figures, 2 tables)

Introduction
Previous Work
Contributions
Preliminaries
Proposed Approach
Biased-MPPI
Sampling from Ancillary Controllers
Autotuning the Inverse Temperature
Illustrative Experiment
Swing-up and tracking
Ancillary Controllers
A Linear Quadratic Regulator (LQR)
A Linear Quadratic Integral (LQI)
A nonlinear Energy-Based Controller (EBC)
Switching Controller
...and 16 more sections

Figures (8)

Figure 1: Top: Usually, MPPI only takes samples around a previous plan. Here, the environment changes unexpectedly, and all the sampled trajectories are in collision, which leads to computing a new plan that also collides. Bottom: our Biased-MPPI adds ancillary controllers to the sampling distribution, quickly converging to a collision avoidance maneuver.
Figure 2: Left, Quanser Qube-Servo, and right, its diagram. The arm's rotation, $\theta$, is the actuated angle. The angle between the pendulum and the upright position, $\alpha$, is not actuated.
Figure 3: Input and state evolution during a pendulum experiment with Biased-MPPI. We show the samples taken and the resulting planned input sequence over the planning horizon for three instances. While we sample all ancillary controllers in each instance, we highlight the one with the most influence on the planned input sequence.
Figure 4: Total cost and control effort over 50 pendulum swing-ups with randomized model parameters.
Figure 5: Two vessels cross each other's paths and are penalized for not giving right-of-way to agents approaching from their right. The large circles are the agents' true local goals, extracted from a global path. IA-MPPI is decentralized and communication-free, so the small dots are the goals each vessel estimates for the other under a constant-velocity assumption. The blue trajectories are those the blue agent has planned for itself and predicted for the other agent; likewise for the orange agent.
...and 3 more figures
