DM-MPPI: Datamodel for Efficient and Safe Model Predictive Path Integral Control

Jiachen Li, Shihao Li, Xu Duan, Dongmei Chen

TL;DR

This work tackles the dual challenges of sample inefficiency and constraint handling in Model Predictive Path Integral (MPPI) control by introducing DM-MPPI, which extends the Datamodels framework to predict how individual trajectory samples influence the control output. An offline phase learns influence coefficients via regression on random subsets, and an online predictor maps cost features to these influences to prune low-impact samples and adapt constraint penalties in real time. A single learned model thus enables both computational efficiency and safety improvements without online regression. Experiments on path tracking with obstacle avoidance show up to a 5× reduction in required samples while maintaining performance and enhancing constraint satisfaction through adaptive penalties.

Abstract

We extend the Datamodels framework from supervised learning to Model Predictive Path Integral (MPPI) control. Whereas Datamodels estimate sample influence via regression on a fixed dataset, we instead learn to predict influence directly from sample cost features, enabling real-time estimation for newly generated samples without online regression. Our influence predictor is trained offline using influence coefficients computed via the Datamodel framework across diverse MPPI instances, and is then deployed online for efficient sample pruning and adaptive constraint handling. A single learned model simultaneously addresses efficiency and safety: low-influence samples are pruned to reduce computational cost, while monitoring the influence of constraint-violating samples enables adaptive penalty tuning. Experiments on path-tracking with obstacle avoidance demonstrate up to a $5\times$ reduction in the number of samples while maintaining control performance and improving constraint satisfaction.
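
To make the offline step concrete, the following is a minimal sketch of how influence coefficients for a single MPPI instance could be estimated by regression on random subsets. It assumes a scalar control perturbation per sample and the usual softmin weighting with temperature `lam`. The Figure 1 caption states that the datamodels are fit via LASSO regression; this sketch substitutes ridge-regularized least squares as a stand-in so that it needs only NumPy. All function names, parameters, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def mppi_output(costs, noise, mask, lam=1.0):
    """Softmin-weighted average of the perturbations selected by `mask`."""
    c = costs[mask]
    w = np.exp(-(c - c.min()) / lam)  # subtract the minimum for numerical stability
    w /= w.sum()
    return w @ noise[mask]

def fit_influence(costs, noise, n_subsets=2000, keep_prob=0.5, ridge=1e-3, seed=0):
    """Regress the subset control output on subset-membership indicators.

    Returns one coefficient per sample; a large magnitude means that
    including the sample shifts the control output noticeably.
    """
    rng = np.random.default_rng(seed)
    K = costs.shape[0]
    masks = rng.random((n_subsets, K)) < keep_prob
    masks[:, 0] |= ~masks.any(axis=1)  # guard: never evaluate an empty subset
    y = np.array([mppi_output(costs, noise, m) for m in masks])
    X = masks.astype(float)
    Xc = X - X.mean(axis=0)  # centering absorbs the intercept
    yc = y - y.mean()
    # Ridge-regularized least squares (assumed stand-in for the LASSO fit).
    return np.linalg.solve(Xc.T @ Xc + ridge * np.eye(K), Xc.T @ yc)

# Toy instance (assumed data): costs increase with the sample index and
# perturbations alternate in sign so that each sample's effect is visible.
K = 20
costs = np.linspace(0.0, 10.0, K)
noise = np.where(np.arange(K) % 2 == 0, 1.0, -1.0)
theta = fit_influence(costs, noise)
print(np.round(theta[:4], 3), np.round(theta[-4:], 3))
```

In this toy setting the low-cost samples receive the largest coefficients, which is the property that motivates pruning: high-cost samples carry almost no softmin weight and therefore barely move the control output. The influence predictor described in the abstract would then be trained to map cost features to coefficients like `theta`, so that no regression is needed online.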

Paper Structure

This paper contains 17 sections, 24 equations, 3 figures, 1 table, 2 algorithms.

Figures (3)

  • Figure 1: MPPI-Datamodel workflow. Offline phase (green): collect MPPI instances, fit datamodels via LASSO regression, and train influence predictor $h_\phi$. Online phase: predict influence using $h_{\phi^*}$ with cost features and instance statistics, prune low-influence samples for efficiency, and adapt penalty $\rho$ based on violation influence ratio for safety.
  • Figure 2: Sample efficiency comparison between DM-MPPI and standard MPPI across varying sample sizes $K \in \{50, 100, 150, \ldots, 500\}$.
  • Figure 3: Comparison of different sampled trajectories.
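
The online phase described in the Figure 1 caption (prune low-influence samples, then adapt the penalty $\rho$ from the violation influence ratio) can be sketched as follows. This is an illustration under stated assumptions rather than the paper's algorithm: the predicted influences are taken as given, the ratio is assumed to be the share of total absolute influence carried by constraint-violating samples, and the multiplicative update rule with its `target`, `gain`, and clipping bounds is hypothetical.

```python
import numpy as np

def prune_by_influence(pred_influence, keep_frac=0.2):
    """Indices of the samples with the largest predicted |influence|."""
    k = max(1, int(np.ceil(keep_frac * pred_influence.size)))
    order = np.argsort(-np.abs(pred_influence))
    return np.sort(order[:k])

def violation_influence_ratio(pred_influence, violates):
    """Share of total |influence| carried by constraint-violating samples."""
    total = np.abs(pred_influence).sum()
    if total == 0.0:
        return 0.0
    return float(np.abs(pred_influence[violates]).sum() / total)

def adapt_penalty(rho, ratio, target=0.05, gain=1.5, rho_min=1.0, rho_max=1e4):
    """Hypothetical rule: raise rho when violators are too influential, else relax it."""
    new_rho = rho * gain if ratio > target else rho / gain
    return float(np.clip(new_rho, rho_min, rho_max))

# Toy values (assumed): five samples, one of which violates a constraint.
pred = np.array([0.5, 0.01, -0.4, 0.02, 0.0])
violates = np.array([False, False, True, False, False])

kept = prune_by_influence(pred, keep_frac=0.5)
ratio = violation_influence_ratio(pred, violates)
rho = adapt_penalty(10.0, ratio)
print(kept, round(ratio, 3), rho)
```

The default `keep_frac=0.2` is chosen only to mirror the up-to-$5\times$ sample reduction reported in the abstract; the fraction actually used at each step would depend on the predicted influences.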

Theorems & Definitions (2)

  • Remark 1: Unified Influence for Efficiency and Safety
  • Remark 2: Computational Complexity