On the Sample Complexity of Imitation Learning for Smoothed Model Predictive Control

Daniel Pfrommer, Swati Padmanabhan, Kwangjun Ahn, Jack Umenberger, Tobia Marcucci, Zakaria Mhammedi, Ali Jadbabaie

TL;DR

The paper tackles imitation learning when the expert is a constrained MPC, aiming to provide end-to-end guarantees by smoothing the expert. It introduces barrier MPC, a log-barrier relaxation of the MPC problem, and proves that the smoothed solution is differentiable with a bounded Hessian, with the Jacobian expressible as a convex combination of affine pieces, and the suboptimality gap bounded by $O(\sqrt{\eta})$. A key theoretical contribution is a lower bound on the residuals in a barrier-augmented convex program, which underpins the smoothness guarantees. Empirically, barrier MPC outperforms randomized smoothing on a toy double integrator, achieving better imitation accuracy and smoother learned policies, thereby offering a faster, constraint-preserving smoothing alternative with stability assurances.

Abstract

Recent work in imitation learning has shown that having an expert controller that is both suitably smooth and stable enables stronger guarantees on the performance of the learned controller. However, constructing such smoothed expert controllers for arbitrary systems remains challenging, especially in the presence of input and state constraints. As our primary contribution, we show how such a smoothed expert can be designed for a general class of systems using a log-barrier-based relaxation of a standard Model Predictive Control (MPC) optimization problem. At the crux of this theoretical guarantee on smoothness is a new lower bound we prove on the optimality gap of the analytic center associated with a convex Lipschitz function, which we hope could be of independent interest. We validate our theoretical findings via experiments, demonstrating the merits of our smoothing approach over randomized smoothing.
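
For orientation, a log-barrier relaxation of a constrained program generically takes the form below. This is the textbook construction, not necessarily the paper's exact definition: the symbols $J$ (cost), $g_i$ (inequality constraints), $m$ (number of constraints), and $x_0$ (initial state) are placeholders introduced here, and the paper's formal barrier MPC problem may contain additional terms.

$$
\min_{u}\; J(x_0, u) \;\; \text{s.t.}\;\; g_i(x_0, u) \le 0,\; i = 1,\dots,m
\qquad\longrightarrow\qquad
\min_{u}\; J(x_0, u) \;-\; \eta \sum_{i=1}^{m} \log\bigl(-g_i(x_0, u)\bigr), \quad \eta > 0 .
$$

The hard constraints are replaced by a penalty that grows without bound as any $g_i$ approaches zero from below, so the minimizer stays strictly feasible and varies smoothly with $x_0$; smaller $\eta$ tracks the original problem more closely.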

Paper Structure

This paper contains 10 sections, 8 theorems, 51 equations, 3 figures.

Key Result

Theorem 4.3

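Suppose that ${u}_{\eta}$ and $u^\star$ are, respectively, the optimizers of the barrier MPC problem (def:barr_mpc_formal) and the reformulated MPC problem (eq:reformulated). Then the suboptimality gap is bounded in terms of the barrier parameter $\eta$ of eq:barrierMPC by $O(\sqrt{\eta})$.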

Figures (3)

  • Figure 1: The explicit MPC controller for the system $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, Q = I, R = 0.01, T=10$ with the constraints $\|x\|_\infty \leq 10, |u|\leq 1$.
  • Figure 2: Visualizations of the log-barrier MPC control policy and several trajectories for the same system as Figure 1 and different choices of $\eta$.
  • Figure 3: Left: The imitation error $\max_{t} \|\hat{x} - x^\star\|$ for the trained MLP over 5 seeds, as a function of the expert smoothness for both randomized smoothing and log-barrier MPC. Center, Right: The $L_0$ (gradient norm) and $L_1$ (Hessian norm) smoothness of $\pi^\star$ as a function of the smoothing parameter.
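
To make the setup in Figures 1 and 2 concrete, the following is a minimal sketch of a log-barrier MPC solve for the double integrator from the Figure 1 caption. Several things here are assumptions rather than the paper's method: the cost is taken to be $\sum_{t=1}^{T} x_t^\top Q x_t + \sum_{t=0}^{T-1} R u_t^2$ with no terminal cost, the barrier is a plain sum of logarithms of the constraint slacks, and the solver is a damped Newton iteration with a fixed iteration budget. The helper names `build_qp` and `barrier_mpc` are hypothetical.

```python
import numpy as np

# System, costs, horizon, and bounds as given in the Figure 1 caption.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q, R, T = np.eye(2), 0.01, 10
X_MAX, U_MAX = 10.0, 1.0


def build_qp(x0):
    """Condense the MPC into: min 0.5 u'Hu + g'u  s.t.  G u <= h (assumed cost form)."""
    n, m = B.shape
    # Stacked predictions: [x_1; ...; x_T] = Sx @ x0 + Su @ u.
    Sx = np.vstack([np.linalg.matrix_power(A, t) for t in range(1, T + 1)])
    Su = np.zeros((T * n, T * m))
    for t in range(1, T + 1):
        for k in range(t):
            Su[(t - 1) * n:t * n, k * m:(k + 1) * m] = (
                np.linalg.matrix_power(A, t - 1 - k) @ B
            )
    Qbar = np.kron(np.eye(T), Q)
    H = 2.0 * (Su.T @ Qbar @ Su + R * np.eye(T * m))
    g = 2.0 * Su.T @ Qbar @ Sx @ x0
    # State bounds |x_t|_inf <= X_MAX and input bounds |u_t| <= U_MAX.
    G = np.vstack([Su, -Su, np.eye(T * m), -np.eye(T * m)])
    h = np.concatenate(
        [X_MAX - Sx @ x0, X_MAX + Sx @ x0, np.full(2 * T * m, U_MAX)]
    )
    return H, g, G, h


def barrier_mpc(x0, eta, iters=100):
    """Minimize 0.5 u'Hu + g'u - eta * sum(log(h - G u)) by damped Newton steps."""
    H, g, G, h = build_qp(x0)

    def f(u):
        s = h - G @ u
        if np.any(s <= 0):
            return np.inf  # outside the barrier's domain
        return 0.5 * u @ H @ u + g @ u - eta * np.sum(np.log(s))

    u = np.zeros(G.shape[1])
    if not np.isfinite(f(u)):
        raise ValueError("u = 0 is not strictly feasible for this x0")
    for _ in range(iters):
        s = h - G @ u
        grad = H @ u + g + eta * G.T @ (1.0 / s)
        hess = H + eta * (G.T * (1.0 / s**2)) @ G
        step = np.linalg.solve(hess, -grad)
        decrement = -grad @ step  # squared Newton decrement
        if decrement < 1e-12:
            break
        # Backtracking line search: keeps iterates strictly feasible.
        t = 1.0
        while f(u + t * step) > f(u) - 0.25 * t * decrement:
            t *= 0.5
        u = u + t * step
    return u


x0 = np.array([3.0, 0.0])
for eta in (1.0, 0.1, 0.01):
    u = barrier_mpc(x0, eta)
    print(f"eta={eta:g}: first input u_0 = {u[0]:+.4f}")
```

Because the line search rejects any step that leaves the barrier's domain, every returned input sequence satisfies the state and input bounds strictly. Starting from `u = 0` is a simplification that only works when the uncontrolled trajectory from `x0` already respects the state bounds.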

Theorems & Definitions (21)

  • Definition 3.2: Local Incremental Input-to-State Stability, cf. pfrommer2022tasil
  • Definition 3.4: Smoothness
  • Definition 3.7: Randomized Smoothed MPC
  • Definition 4.1: nesterov1994interior
  • Theorem 4.3
  • proof
  • Lemma 4.4
  • proof
  • Lemma 4.5
  • proof
  • ...and 11 more