SIT-LMPC: Safe Information-Theoretic Learning Model Predictive Control for Iterative Tasks

Zirui Zang, Ahmad Amine, Nick-Marios T. Kokolakis, Truong X. Nghiem, Ugo Rosolia, Rahul Mangharam

TL;DR

SIT-LMPC targets safe, high-performance control of iterative tasks under uncertainty. It extends learning model predictive control (LMPC) to stochastic nonlinear systems and solves the resulting constrained optimization with a constrained model predictive path integral (MPPI) controller whose penalties adapt online. The value function is estimated with normalizing flows, which capture richer uncertainty and enable safer trajectories, and parallel GPU execution yields real-time performance. In point-mass, simulated autonomous racing, and real 1/5-scale vehicle experiments, SIT-LMPC converges faster and is safer than LMPC and ABC-LMPC. The result is a scalable, data-driven framework for safe iterative learning in robotics under uncertainty.

Abstract

Robots executing iterative tasks in complex, uncertain environments require control strategies that balance robustness, safety, and high performance. This paper introduces a safe information-theoretic learning model predictive control (SIT-LMPC) algorithm for iterative tasks. Specifically, we design an iterative control framework based on an information-theoretic model predictive control algorithm to address a constrained infinite-horizon optimal control problem for discrete-time nonlinear stochastic systems. An adaptive penalty method is developed to ensure safety while balancing optimality. Trajectories from previous iterations are utilized to learn a value function using normalizing flows, which enables richer uncertainty modeling compared to Gaussian priors. SIT-LMPC is designed for highly parallel execution on graphics processing units, allowing efficient real-time optimization. Benchmark simulations and hardware experiments demonstrate that SIT-LMPC iteratively improves system performance while robustly satisfying system constraints.
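
To make the combination of sampling-based control and penalties concrete, the following is a minimal NumPy sketch of one MPPI-style update in which constraint violations enter the cost through a penalty weight. It is an illustration under stated assumptions, not the paper's algorithm: the function names (`mppi_step`, `dynamics`, `stage_cost`, `violation`), the single-integrator toy system, the simplified cost without a control-noise term, and the double-or-decay penalty rule are all hypothetical. The learned value function and the safe set are omitted.

```python
import numpy as np

def mppi_step(x0, u_nom, dynamics, stage_cost, violation, penalty,
              n_samples=256, sigma=0.5, temperature=1.0, rng=None):
    """One simplified MPPI update with a penalty on constraint violation."""
    rng = np.random.default_rng() if rng is None else rng
    horizon, n_u = u_nom.shape
    # Perturb the nominal control sequence with Gaussian noise.
    noise = rng.normal(0.0, sigma, size=(n_samples, horizon, n_u))
    controls = u_nom[None] + noise
    costs = np.zeros(n_samples)
    total_violation = np.zeros(n_samples)
    x = np.repeat(x0[None], n_samples, axis=0)
    # Roll out all samples in parallel (vectorized over the sample axis).
    for t in range(horizon):
        x = dynamics(x, controls[:, t])
        costs += stage_cost(x, controls[:, t])
        total_violation += violation(x)
    penalized = costs + penalty * total_violation
    # Exponential (softmin) weights; subtract the minimum for stability.
    weights = np.exp(-(penalized - penalized.min()) / temperature)
    weights /= weights.sum()
    u_new = np.einsum("k,kth->th", weights, controls)
    return u_new, total_violation

# Hypothetical toy problem: 1-D single integrator, goal at x = 1.0,
# state constraint x <= 1.2.
def dynamics(x, u):
    return x + 0.1 * u

def stage_cost(x, u):
    return np.sum((x - 1.0) ** 2, axis=-1)

def violation(x):
    return np.sum(np.maximum(x - 1.2, 0.0), axis=-1)

rng = np.random.default_rng(0)
x = np.array([0.0])
u_nom = np.zeros((10, 1))
penalty = 1.0
for _ in range(30):
    u_nom, viol = mppi_step(x, u_nom, dynamics, stage_cost, violation,
                            penalty, rng=rng)
    # Assumed adaptation rule: grow the penalty when samples violate the
    # constraint, otherwise let it decay toward its floor.
    penalty = penalty * 2.0 if viol.mean() > 0 else max(1.0, penalty * 0.9)
    # Apply the first control, then warm-start by shifting the sequence.
    x = dynamics(x[None], u_nom[:1])[0]
    u_nom = np.roll(u_nom, -1, axis=0)
    u_nom[-1] = 0.0
```

Because every sample is rolled out independently, the loop over `n_samples` is fully vectorized here, which is the same property that makes this family of controllers suitable for GPU execution.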

Paper Structure

This paper contains 13 sections, 15 equations, 5 figures, and 2 algorithms.

Figures (5)

  • Figure 1: SIT-LMPC architecture: starting from an initial trajectory, the algorithm iteratively updates the safe set and value function model (orange loop), while solving multiple MPPI problems in parallel (blue loop) to generate optimal trajectories.
  • Figure 2: Point mass experiment. (a) Convergence of lap time over iterations. (b) Layout of the experiment.
  • Figure 3: Top: Convergence of lap time for three simulated experiments (five independent rollouts in light color and averages in dark color; $\times$ denotes out-of-track). Bottom: Fastest lap trajectories with velocity colorbar.
  • Figure 4: Ablation study of key SIT-LMPC components for a CEM controller (left) and an MPPI controller (right), comparing NF and BNN value function models, with and without the AP method. Plotted are averages of five rollouts with $\times$ denoting out-of-track.
  • Figure 5: Experimental setup and results with a real vehicle. (a) Platform. (b) Track. (c) Lap time per iteration. (d) Boundary violations.

Theorems & Definitions (2)

  • Remark 1
  • Remark 2