
Path Integral Control with Rollout Clustering and Dynamic Obstacles

Steven Patrick, Efstathios Bakolas

TL;DR

This paper addresses two weaknesses of MPPI: its weighted trajectory average can be unsafe, and it does not handle dynamic obstacles. It introduces Rollout Clustering, which uses DBSCAN to partition the trajectory samples and applies truncated Gaussian importance sampling within each cluster, improving safety without substantial overhead. It also proposes a dynamic-obstacle cost framework that augments the running and terminal costs with simulated obstacle trajectories, so the obstacle terms are computed additively. Empirically, the methods reduce collisions and failures in both static and dynamic obstacle scenarios with modest increases in computation time, making MPPI more robust for real-time autonomous navigation in uncertain settings.

Abstract

Model Predictive Path Integral (MPPI) control has proven to be a powerful tool for the control of uncertain systems (such as systems subject to disturbances and systems with unmodeled dynamics). One important limitation of the baseline MPPI algorithm is that it does not utilize simulated trajectories to their fullest extent. First, it assumes that the average of all trajectories, weighted by their performance index, will be a safe trajectory. In this paper, multiple examples are shown where this assumption does not hold, and a trajectory clustering technique is presented that reduces the chance of the weighted average crossing into an unsafe region. Second, MPPI does not account for dynamic obstacles, so the authors put forward a novel cost function that accounts for dynamic obstacles without adding significant computation time to the overall algorithm. The novel contributions proposed in this paper were evaluated with extensive simulations to demonstrate improvements upon state-of-the-art MPPI techniques.
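The failure of the weighted average can be illustrated with a minimal one-dimensional sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the cost function `cost`, the temperature `lam`, the sample grid, and the helper `split_by_gaps`. The helper is a simple gap-based split standing in for DBSCAN (in one dimension the two coincide when every point counts as a core point). The truncated Gaussian importance sampling step is not shown.

```python
import math

def cost(u):
    """Hypothetical 1D cost: low near u = -1 and u = +1, high bump at u = 0."""
    return min((u + 1.0) ** 2, (u - 1.0) ** 2) + 5.0 * math.exp(-(u / 0.3) ** 2)

def weighted_average(samples, lam=0.1):
    """MPPI-style average with weights proportional to exp(-cost / lam)."""
    costs = [cost(u) for u in samples]
    c_min = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - c_min) / lam) for c in costs]
    return sum(w * u for w, u in zip(weights, samples)) / sum(weights)

def split_by_gaps(samples, eps):
    """Stand-in for DBSCAN in 1D: start a new cluster at every gap > eps."""
    ordered = sorted(samples)
    clusters = [[ordered[0]]]
    for u in ordered[1:]:
        if u - clusters[-1][-1] > eps:
            clusters.append([u])
        else:
            clusters[-1].append(u)
    return clusters

# Samples concentrated around the two low-cost regions (assumed, not random).
samples = [-1.0 + 0.05 * k for k in range(-6, 7)]
samples += [1.0 + 0.05 * k for k in range(-6, 7)]

# Averaging everything lands between the two modes, on the high-cost bump.
plain = weighted_average(samples)

# Averaging within each cluster gives one candidate per low-cost region.
clustered = [weighted_average(c) for c in split_by_gaps(samples, eps=0.2)]

print("plain average:", plain, "cost:", cost(plain))
for c in clustered:
    print("cluster average:", c, "cost:", cost(c))
```

In this toy setup the plain average sits near u = 0, where the cost is highest, while each per-cluster average stays close to one of the two minima. How a single control is then chosen among the cluster candidates is not covered by this sketch.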

Paper Structure (10 sections, 27 equations, 5 figures, 1 table, 3 algorithms)


Figures (5)

  • Figure 1: Example of standard MPPI failing. Trajectories with low cost, and therefore high value, are separated by a region of high-cost trajectories. The resulting weighted average used by MPPI lies in the high-cost, low-value region.
  • Figure 2: Example of standard MPPI (left) producing an undesirable result and Clustered MPPI (right) producing multiple valid solutions. The blue solid line is the cost function, the X marks are sampled points, and the filled-in stars are the results of the weighted average.
  • Figure 3: Random forest environment example. The agent is the magenta dot, the goal is the green star, and the obstacles are light and dark red.
  • Figure 4: Dynamic obstacle example where the agent is a magenta circle, the obstacle is a red circle traveling in the negative $x$ direction, and the goal is the green star. The top-left panel shows all the simulated trajectories of the agent, the top-middle panel shows the simulated obstacle trajectories, and the top-right panel shows the resulting paths from each algorithm. The bottom row plots the value functions with respect to the deviation from the reference control input. Different clusters of trajectories are denoted in different colors. The resulting control input deviation is denoted with a yellow star.
  • Figure 5: Dynamic environment example. The agent is a magenta circle, and the obstacles are red, green, and blue circles with their direction of motion indicated by black arrows. The goal is the green star. The yellow and cyan lines show the first iteration of DC-MPPI, with yellow being the reference input and cyan being the result.
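The dynamic-obstacle setting of Figures 4 and 5 can be sketched as an additive penalty on top of an ordinary rollout cost. The exact cost used in the paper is not reproduced here, so the following is only one plausible form under stated assumptions: the distance-to-goal base cost, the `radius` and `penalty` values, and the helper names `collision_fraction` and `rollout_cost` are all hypothetical. Running and terminal terms are folded into a single loop for brevity.

```python
import math

def collision_fraction(agent_xy, obstacle_samples_xy, radius):
    """Fraction of simulated obstacle positions within `radius` of the agent."""
    hits = sum(1 for o in obstacle_samples_xy if math.dist(agent_xy, o) <= radius)
    return hits / len(obstacle_samples_xy)

def rollout_cost(agent_traj, obstacle_trajs, goal, radius=0.6, penalty=100.0):
    """Base cost plus an additive dynamic-obstacle term at every time step.

    Assumed form, not the paper's: base cost is the distance to the goal, and
    the obstacle term is `penalty` times the fraction of simulated obstacle
    trajectories that collide with the agent at that step.
    """
    total = 0.0
    for t, agent_xy in enumerate(agent_traj):
        obstacles_at_t = [traj[t] for traj in obstacle_trajs]
        base = math.dist(agent_xy, goal)
        total += base + penalty * collision_fraction(agent_xy, obstacles_at_t, radius)
    return total

goal = (4.0, 0.0)
steps = range(5)

# Two simulated obstacle trajectories moving in the negative x direction
# at different assumed speeds.
obstacle_trajs = [
    [(4.0 - 1.0 * t, 0.0) for t in steps],
    [(4.0 - 0.5 * t, 0.0) for t in steps],
]

# Two candidate agent rollouts: straight at the goal, and a detour.
straight = [(float(t), 0.0) for t in steps]
detour = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.5), (3.0, 1.0), (4.0, 0.0)]

cost_straight = rollout_cost(straight, obstacle_trajs, goal)
cost_detour = rollout_cost(detour, obstacle_trajs, goal)
print("straight:", cost_straight, "detour:", cost_detour)
```

Here the straight rollout meets one of the two simulated obstacle trajectories at each of two time steps and is penalized accordingly, while the detour pays only a slightly larger distance cost. Because the obstacle term is simply added to each rollout's cost, it reuses the rollouts MPPI already simulates.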
