Distributionally Robust Imitation Learning: Layered Control Architecture for Certifiable Autonomy

Aditya Gahlawat, Ahmed Aboudonia, Sandeep Banik, Naira Hovakimyan, Nikolai Matni, Aaron D. Ames, Gioele Zardini, Alberto Speranzon

TL;DR

The paper tackles distribution shifts in imitation learning by proposing DRIP, a Layered Control Architecture that unites TaSIL (robustness to policy shifts) with $\mathcal{L}_1$-DRAC (robustness to modeling uncertainty). It decouples policy-induced and uncertainty-induced gaps, deriving tractable bounds: TaSIL yields a trajectory-level imitation gap that scales as $O\left(\frac{\log n}{n}\right)$ with data size, while $\mathcal{L}_1$-DRAC provides a training-free, probabilistic bound largely independent of data size. The resulting total imitation gap is the sum of these certifiable components, enabling a fully certifiable autonomy pipeline that integrates learning with robust control and perception. Numerical experiments illustrate that TaSIL alone may fail under uncertainty, whereas the combined DRIP architecture stabilizes the system and bounds the imitation gap, highlighting practical implications for certifiable autonomous systems.
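The additive structure described above can be read as a triangle-inequality argument. The following is only a sketch: the symbols $X_t$ (true uncertain system under the full architecture), $\hat{x}_t$ (nominal system under the learned TaSIL policy), $x_t^\star$ (nominal system under the expert), the constant $\rho$, and the generic norm are illustrative assumptions rather than the paper's notation, and the paper's precise probabilistic statement may differ.

```latex
\begin{align*}
\underbrace{\lVert X_t - x_t^\star \rVert}_{\text{total imitation gap}}
  &\le \underbrace{\lVert X_t - \hat{x}_t \rVert}_{\text{uncertainty-induced ($\mathcal{L}_1$-DRAC)}}
     + \underbrace{\lVert \hat{x}_t - x_t^\star \rVert}_{\text{policy-induced (TaSIL)}} \\
  &\le \rho + O\!\left(\frac{\log n}{n}\right)
\end{align*}
% \rho: hypothetical bound from the low-level controller, holding with high
%       probability and not depending on the number of demonstrations n.
```

Under this reading, collecting more demonstrations would shrink only the second term, while the first is set by the design of the low-level controller.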

Abstract

Imitation learning (IL) enables autonomous behavior by learning from expert demonstrations. While more sample-efficient than comparative alternatives like reinforcement learning, IL is sensitive to compounding errors induced by distribution shifts. There are two significant sources of distribution shifts when using IL-based feedback laws on systems: distribution shifts caused by policy error, and distribution shifts due to exogenous disturbances and endogenous model errors due to lack of learning. Our previously developed approaches, Taylor Series Imitation Learning (TaSIL) and $\mathcal{L}_1$-Distributionally Robust Adaptive Control ($\mathcal{L}_1$-DRAC), address the challenge of distribution shifts in complementary ways. While TaSIL offers robustness against policy error-induced distribution shifts, $\mathcal{L}_1$-DRAC offers robustness against distribution shifts due to aleatoric and epistemic uncertainties. To enable certifiable IL for learned and/or uncertain dynamical systems, we formulate the Distributionally Robust Imitation Policy (DRIP) architecture, a Layered Control Architecture (LCA) that integrates TaSIL and $\mathcal{L}_1$-DRAC. By judiciously designing individual layer-centric input and output requirements, we show how we can guarantee certificates for the entire control pipeline. Our solution paves the path for designing fully certifiable autonomy pipelines by integrating learning-based components, such as perception, with certifiable model-based decision-making through the proposed LCA approach.

Paper Structure

This paper contains 16 sections, 4 theorems, 58 equations, 5 figures.

Key Result

Proposition III.1

Under Assumption [assmp:ILF], the nominal (expert) system [eqn:ExpertProcess] is incrementally input-to-state stable ($\delta$-ISS) [angeli2002lyapunov]. In particular, for any $\xi_1, \xi_2 \in \mathbb{R}^n$ and $\varsigma \in \mathcal{C}\left([0,T];\mathbb{R}^m\right)$ where $\lambda_\theta = 2 \lambda - \theta \Delta_g^2$, $\theta \in \left(0, 2\lambda / \Delta_g^2 \right)$, and $y_t^\star = x_t^{\pi

Figures (5)

  • Figure 1: Illustration of a layered control architecture that integrates TaSIL and $\mathcal{L}_1$-DRAC, where $X_{t}$ represents the state of the system. In this architecture TaSIL operates as a mid-level controller that generates reference commands for the low-level $\mathcal{L}_1$-DRAC.
  • Figure 2: Top panel: The expert trajectories available for imitation learning in our formulation are generated by the uncertain (true) system operating under some expert input process. The expert trajectory data is then used to learn a nominal model whose predictive performance is only guaranteed on the expert trajectories. Bottom panel: The methodology of TaSIL is designed to be robust against distribution shifts due to a difference between the expert and optimal policies. On the other hand, $\mathcal{L}_1$-DRAC is designed to be robust against the effects of inaccuracies between the true and surrogate models. Together, TaSIL and $\mathcal{L}_1$-DRAC can thus offer robustness guarantees against a comprehensive set of distribution shift sources.
  • Figure 3: Distribution shift sources and mitigation. (a) Distribution shift from learned policy deviating from expert. (b) Distribution shift from epistemic and aleatoric uncertainties. (c) Unified robustness via TaSIL + $\mathcal{L}_1$-DRAC layered architecture.
  • Figure 4: The architecture of the $\mathcal{L}_1$-DRAC controller. The controller has three components: a process predictor with output $\hat{X}_t$, an adaptation law, and a low pass filter.
  • Figure 5: Comparison of total imitation gap under i) TaSIL for the nominal system, ii) TaSIL for the uncertain system, and iii) TaSIL and $\mathcal{L}_1$-DRAC for the uncertain system.
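
The three components named in the Figure 4 caption (process predictor, adaptation law, low-pass filter) can be illustrated with a toy example. The sketch below is a classical scalar $\mathcal{L}_1$ adaptive loop with a piecewise-constant adaptation law, not the distributionally robust controller from the paper. The plant, the disturbance `sigma`, the function name `simulate`, and all numeric parameters are assumptions chosen for illustration.

```python
import math

def simulate(use_l1, T=10.0, dt=1e-3, a_m=-2.0, b=1.0, w=50.0, x0=1.0):
    """Return max |x - x_nom| for the toy plant x' = a_m*x + b*(u + sigma).

    All values are illustrative assumptions, not taken from the paper.
    """
    Ad = math.exp(a_m * dt)        # exact one-step state transition
    Phi = (Ad - 1.0) / a_m         # integral of exp(a_m*s) over one step
    Fd = math.exp(-w * dt)         # one-step decay of the low-pass filter
    x = x_nom = x_hat = x0
    u_ad = 0.0                     # filter state (adaptive input)
    worst = 0.0
    for k in range(int(round(T / dt))):
        t = k * dt
        sigma = 1.0 + 0.5 * math.sin(t)   # uncertainty, unknown to the controller
        u_base = 0.0                      # stand-in for the mid-level command

        # Adaptation law: choose sigma_hat to cancel the propagated
        # prediction error over the next sampling interval.
        x_tilde = x_hat - x
        sigma_hat = -Ad * x_tilde / (b * Phi)

        u = u_base + (u_ad if use_l1 else 0.0)

        # True plant, process predictor, and uncertainty-free nominal model.
        x = Ad * x + Phi * b * (u + sigma)
        x_hat = Ad * x_hat + Phi * b * (u + sigma_hat)
        x_nom = Ad * x_nom + Phi * b * u_base

        # Low-pass filter: u_ad tracks -sigma_hat with bandwidth w.
        u_ad = Fd * u_ad + (1.0 - Fd) * (-sigma_hat)

        worst = max(worst, abs(x - x_nom))
    return worst

if __name__ == "__main__":
    print("max deviation without augmentation:", simulate(use_l1=False))
    print("max deviation with augmentation:   ", simulate(use_l1=True))
```

In this toy setting the augmented run should stay much closer to the nominal trajectory than the unaugmented one, which mirrors qualitatively the comparison described in the Figure 5 caption. The filter bandwidth `w` trades off how quickly the estimated uncertainty is cancelled against how much high-frequency content reaches the input.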

Theorems & Definitions (13)

  • Definition 1: Vector Fields
  • Definition 2: Systems
  • Remark II.1
  • Definition 3: Training Data
  • Definition 4: TaSIL - $\mathcal{L}_1$-DRAC Error Process
  • Definition 5
  • Proposition III.1
  • Proof
  • Corollary III.1
  • Remark III.1
  • ...and 3 more