Improved Sample Complexity of Imitation Learning for Barrier Model Predictive Control

Daniel Pfrommer, Swati Padmanabhan, Kwangjun Ahn, Jack Umenberger, Tobia Marcucci, Zakaria Mhammedi, Ali Jadbabaie

TL;DR

Improving on the authors' previous work, this paper shows that barrier MPC achieves a theoretically optimal error-to-smoothness tradeoff along some direction. The smoothness guarantee rests on an improved lower bound on the optimality gap of the analytic center associated with a convex Lipschitz function, a result that could be of independent interest.

Abstract

Recent work in imitation learning has shown that having an expert controller that is both suitably smooth and stable enables stronger guarantees on the performance of the learned controller. However, constructing such smoothed expert controllers for arbitrary systems remains challenging, especially in the presence of input and state constraints. As our primary contribution, we show how such a smoothed expert can be designed for a general class of systems using a log-barrier-based relaxation of a standard Model Predictive Control (MPC) optimization problem. Improving upon our previous work, we show that barrier MPC achieves a theoretically optimal error-to-smoothness tradeoff along some direction. At the core of this theoretical guarantee on smoothness is an improved lower bound we prove on the optimality gap of the analytic center associated with a convex Lipschitz function, which we believe could be of independent interest. We validate our theoretical findings via experiments, demonstrating the merits of our smoothing approach over randomized smoothing.

Paper Structure

This paper contains 25 sections, 25 theorems, 154 equations, and 3 figures.

Key Result

Lemma 2.2

Given a feasible initial state ${x}$, let $\sigma({x}) \in \{0,1\}^m$ denote the indicator of active constraints of the optimizer of eq:reformulated, with $\sigma_i({x})= 1$ iff the $i$th constraint is active. For $\boldsymbol{\sigma} \in \{0,1\}^m$, let $P_{\boldsymbol{\sigma}} = \{ {x} | \sigma({x

Figures (3)

  • Figure 1: The explicit MPC controller for $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, Q = I, R = 0.01, T=10$ with the constraints $\|x\|_\infty \leq 10, |u|\leq 1$. For this simple 2-dimensional system there are $261$ $K_\sigma$. This figure appeared in our previous work pfrommer2024sample.
  • Figure 2: Visualizations of the log-barrier MPC control policy and several trajectories for the same system as Figure 1 and different choices of $\eta$. This figure appeared in our previous work pfrommer2024sample.
  • Figure 3: Left: The imitation error $\max_{t} \|\hat{x} - x^\star\|$ for the trained MLP over 5 seeds, as a function of the expert smoothness for both randomized smoothing and log-barrier MPC. Center, Right: The $L_0$ (gradient norm) and $L_1$ (Hessian norm) smoothness of $\pi^\star$ as a function of the smoothing parameter. This figure appeared in our previous work pfrommer2024sample.
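
To make the setup behind Figures 1 and 2 concrete, the following is a minimal sketch of a log-barrier relaxation of MPC for the double-integrator system in the Figure 1 caption. It is an illustration under stated assumptions, not the paper's implementation: the cost is assumed to be a plain sum of stage costs over $x_1, \dots, x_T$ and $u_0, \dots, u_{T-1}$ with no terminal term, the barrier is a standard (non-recentered) log-barrier weighted by $\eta$, and the helper names `prediction_matrices`, `barrier_mpc`, and `quadratic_cost` are hypothetical. The solver is a damped Newton method started from $u = 0$, which must be strictly feasible for the chosen initial state.

```python
import numpy as np

# System and cost taken from the Figure 1 caption (double integrator).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = 0.01
T = 10


def prediction_matrices(A, B, T):
    """Return F, G such that the stacked states (x_1..x_T) equal F x0 + G u."""
    n, m = B.shape
    F = np.vstack([np.linalg.matrix_power(A, t) for t in range(1, T + 1)])
    G = np.zeros((T * n, T * m))
    for t in range(1, T + 1):      # block row for x_t
        for s in range(t):         # u_s reaches x_t through A^(t-1-s) B
            G[(t - 1) * n:t * n, s * m:(s + 1) * m] = (
                np.linalg.matrix_power(A, t - 1 - s) @ B
            )
    return F, G


def quadratic_cost(x0, u):
    """Assumed MPC cost: sum of x_t' Q x_t (t=1..T) plus R * u_t^2 (t=0..T-1)."""
    F, G = prediction_matrices(A, B, T)
    X = F @ x0 + G @ u
    return X @ np.kron(np.eye(T), Q) @ X + R * u @ u


def barrier_mpc(x0, eta, x_max=10.0, u_max=1.0, max_iters=200, tol=1e-10):
    """Minimize cost(u) - eta * sum(log(slack)) by damped Newton steps."""
    n, m = B.shape
    F, G = prediction_matrices(A, B, T)
    Qbar = np.kron(np.eye(T), Q)
    H = 2.0 * (G.T @ Qbar @ G + R * np.eye(T * m))  # Hessian of the cost
    g = 2.0 * G.T @ Qbar @ (F @ x0)                 # linear term of the cost
    free = F @ x0                                   # states under u = 0
    # All constraints as C u <= d: |u_t| <= u_max and |x_t|_inf <= x_max.
    C = np.vstack([np.eye(T * m), -np.eye(T * m), G, -G])
    d = np.concatenate([np.full(2 * T * m, u_max), x_max - free, x_max + free])

    def objective(u):
        slack = d - C @ u
        if np.any(slack <= 0):
            return np.inf  # outside the barrier's domain
        return 0.5 * u @ H @ u + g @ u - eta * np.sum(np.log(slack))

    u = np.zeros(T * m)
    if not np.isfinite(objective(u)):
        raise ValueError("u = 0 is not strictly feasible for this x0")
    for _ in range(max_iters):
        slack = d - C @ u
        grad = H @ u + g + eta * C.T @ (1.0 / slack)
        hess = H + eta * C.T @ (C / slack[:, None] ** 2)
        step = -np.linalg.solve(hess, grad)
        decrement = -grad @ step  # squared Newton decrement
        if decrement / 2.0 < tol:
            break
        # Backtracking line search: stay strictly feasible and decrease enough.
        t, f_u = 1.0, objective(u)
        while objective(u + t * step) > f_u - 0.25 * t * decrement and t > 1e-12:
            t *= 0.5
        u = u + t * step
    return u


x0 = np.array([2.0, 0.5])
u_sharp = barrier_mpc(x0, eta=0.01)   # small eta: closer to the constrained MPC
u_smooth = barrier_mpc(x0, eta=1.0)   # large eta: pushed away from the constraints
print("first control, eta=0.01:", u_sharp[0])
print("first control, eta=1.0: ", u_smooth[0])
```

The controller applied at state `x0` is the first entry of the returned sequence. In this sketch, a larger $\eta$ keeps the plan further from the constraint boundary at the price of a higher quadratic cost, which is the error-versus-smoothness tension that the paper quantifies.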

Theorems & Definitions (60)

  • Lemma 2.2: bemporad2002explicit
  • Definition 3.2: Local Input-to-State Stability with Linear Gain, cf. pfrommer2022tasil
  • Definition 3.4: pfrommer2022tasil
  • Definition 3.7: $\epsilon$-Smoothing Algorithm
  • Definition 3.8: Smoothing Algorithm
  • Lemma 3.9: kornowski2021oracle, Lemma 30
  • proof
  • Lemma 3.10
  • proof
  • Definition 3.11
  • ...and 50 more