
Data-Driven Adversarial Online Control for Unknown Linear Systems

Zishun Liu, Yongxin Chen

TL;DR

This work tackles adversarial online control for unknown LTI systems by combining behavioral-system theory with a data-driven, Hankel-based non-parametric representation, bypassing explicit system identification. It introduces an adaptive disturbance-based controller (ADAC) whose parameters are updated by online gradient descent within an explore-then-commit (ETC) framework, and proves a high-probability regret of $\tilde{\mathcal{O}}(T^{2/3})$, matching the best-known model-based guarantees. The approach extends naturally to output feedback via an observed-disturbance ADAC/ODAC formulation, preserving the same sublinear regret and offering computational advantages over SDP-based or full-system identification methods. Overall, the paper bridges fundamental behavioral principles with online learning to deliver scalable, data-driven control in adversarial settings, with practical implications for unknown-dynamics control where data are plentiful but models are incomplete.

Abstract

We consider the online control problem with an unknown linear dynamical system in the presence of adversarial perturbations and adversarial convex loss functions. Although the problem is widely studied in model-based control, it remains unclear whether data-driven approaches, which bypass the system identification step, can solve the problem. In this work, we present a novel data-driven online adaptive control algorithm to address this online control problem. Our algorithm leverages the behavioral systems theory to learn a non-parametric system representation and then adopts a perturbation-based controller updated by online gradient descent. We prove that our algorithm guarantees an $\tilde{\mathcal{O}}(T^{2/3})$ regret bound with high probability, which matches the best-known regret bound for this problem. Furthermore, we extend our algorithm and performance guarantee to the cases with output feedback.
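The abstract states that the controller parameters are updated by online gradient descent. The excerpt does not give the controller parameterization, the loss, or the step-size schedule, so the following is only a minimal sketch of generic projected online gradient descent over a Euclidean ball. The names `project_to_ball`, `online_gradient_descent`, `step_size`, and `radius` are hypothetical and are not taken from the paper.

```python
import math

def project_to_ball(theta, radius):
    """Euclidean projection of a parameter vector onto the ball of given radius."""
    norm = math.sqrt(sum(v * v for v in theta))
    if norm <= radius:
        return list(theta)
    return [v * radius / norm for v in theta]

def online_gradient_descent(grad_fns, theta0, step_size, radius):
    """Play one iterate per round, then step along the revealed gradient.

    grad_fns: one gradient oracle per round (assumption: each loss is convex
    and its gradient is available after the iterate is played).
    Returns the list of played iterates and the final parameter vector.
    """
    theta = project_to_ball(theta0, radius)
    iterates = []
    for grad in grad_fns:
        iterates.append(theta)                      # parameters played this round
        g = grad(theta)                             # gradient of this round's loss
        stepped = [v - step_size * gv for v, gv in zip(theta, g)]
        theta = project_to_ball(stepped, radius)    # keep parameters feasible
    return iterates, theta

# Usage: a fixed quadratic loss (theta - 1)^2 repeated for 200 rounds.
rounds = [lambda th: [2.0 * (th[0] - 1.0)]] * 200
played, final_theta = online_gradient_descent(rounds, [0.0], 0.1, 5.0)
```

In the paper's setting the losses change adversarially from round to round, which is why the update uses only the gradient revealed after each iterate is played.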

Paper Structure

This paper contains 31 sections, 12 theorems, 92 equations, 5 algorithms.

Key Result

Lemma 3.1

Suppose Assumptions (as: AB) and (as: bounded_w) hold. Then, given a clean trajectory $\{x^d,u^d\}=\{x_1^d,u_1^d,\dots,x_{N}^d,u_{N}^d\}$ of the system (sys: noise-LTI), where $u^d$ is persistently exciting of order $L+2n$, the sequence $\{x_0,u_0,\dots,x_{L-1},u_{L-1}\}$ is a trajectory of the system (sys: noise-LTI) if and only if ... Moreover, in this case, ...
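Lemma 3.1 requires the input $u^d$ to be persistently exciting. The paper's own definition is not reproduced in this excerpt, so the sketch below assumes the standard one from behavioral systems theory: a signal is persistently exciting of order $k$ when its block Hankel matrix with $k$ block rows has full row rank. The helper names `hankel`, `rank`, and `is_persistently_exciting` are hypothetical, and exact rational arithmetic is used only to keep the example dependency-free.

```python
from fractions import Fraction

def hankel(signal, depth):
    """Block Hankel matrix with `depth` block rows from a list of sample vectors."""
    n_cols = len(signal) - depth + 1
    if n_cols < 1:
        raise ValueError("signal is shorter than the requested depth")
    rows = []
    for i in range(depth):                  # block row i holds samples i .. i+n_cols-1
        for k in range(len(signal[0])):     # one scalar row per channel
            rows.append([Fraction(signal[i + j][k]) for j in range(n_cols)])
    return rows

def rank(matrix):
    """Exact rank via Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def is_persistently_exciting(signal, order):
    """True when the depth-`order` Hankel matrix of `signal` has full row rank."""
    h = hankel(signal, order)
    return rank(h) == len(h)

# Usage: a scalar impulse of length 5 versus a constant input.
impulse = [[0], [0], [1], [0], [0]]
constant = [[1]] * 10
impulse_ok = is_persistently_exciting(impulse, 3)
constant_ok = is_persistently_exciting(constant, 2)
```

A constant input fails the check for any order above one because every row of its Hankel matrix is identical, which is the kind of uninformative data the persistency-of-excitation condition rules out.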

Theorems & Definitions (19)

  • Definition 2.1
  • Definition 2.2
  • Definition 3.1
  • Definition 3.2
  • Lemma 3.1
  • Definition 3.3
  • Lemma 5.1
  • Lemma 5.2
  • Lemma 5.3
  • Lemma 5.4
  • ...and 9 more