Infinite-Horizon Distributionally Robust Regret-Optimal Control

Taylan Kargin, Joudi Hajar, Vikrant Malik, Babak Hassibi

TL;DR

The paper tackles infinite-horizon distributionally robust regret-optimal (DR-RO) control for discrete-time LTI systems under a Wasserstein-2 ambiguity set around the disturbances, aiming to minimize the steady-state worst-case expected regret relative to a clairvoyant non-causal policy. It derives a tractable saddle-point formulation and proves strong duality, showing that the optimal DR-RO controller is non-rational but admits a finite-dimensional parameterization that can be computed efficiently in the frequency domain using a Frank-Wolfe-type algorithm. A key contribution is a convex, LMI-based procedure that approximates the non-rational controller in the $H_\infty$-norm with a near-optimal rational state-space controller, enabling practical real-time implementations. The approach yields stability guarantees and robustness to time-correlated disturbances, and the empirical results show that the infinite-horizon DR-RO controller outperforms finite-horizon methods and standard $H_2$/$H_\infty$ baselines, with rational approximations closely matching the non-rational performance. Overall, the work delivers a scalable, implementable framework for DR regret-optimal control under distributional uncertainty and time correlation, bridging theory and practical controller design.
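To make the phrase "Frank-Wolfe-type algorithm" concrete, the sketch below runs the generic Frank-Wolfe (conditional gradient) iteration on a toy problem. It is not the paper's frequency-domain method: the probability-simplex constraint, the quadratic objective, the step size $2/(k+2)$, and the names `frank_wolfe_simplex` and `target` are all assumptions chosen only for illustration.

```python
import numpy as np


def frank_wolfe_simplex(grad, x0, num_iters=200):
    """Generic Frank-Wolfe iteration over the probability simplex.

    grad: callable returning the gradient of a smooth convex objective.
    x0:   feasible starting point (nonnegative entries summing to one).
    """
    x = np.asarray(x0, dtype=float).copy()
    for k in range(num_iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, the minimizer of
        # <g, s> is the vertex at the smallest gradient coordinate.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        # Classical open-loop step size 2 / (k + 2).
        step = 2.0 / (k + 2.0)
        # A convex combination of feasible points stays feasible,
        # so no projection is needed.
        x = (1.0 - step) * x + step * s
    return x


# Toy objective (hypothetical): f(x) = 0.5 * ||x - target||^2,
# whose gradient is x - target and whose minimizer lies in the simplex.
target = np.array([0.2, 0.5, 0.3])
x_hat = frank_wolfe_simplex(lambda x: x - target,
                            x0=np.array([1.0, 0.0, 0.0]),
                            num_iters=2000)
```

The appeal of this family of methods is that each iteration needs only a linear minimization over the feasible set rather than a projection, which helps when that linear subproblem is cheap to solve.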

Abstract

We study the infinite-horizon distributionally robust (DR) control of linear systems with quadratic costs, where the disturbances have an unknown, possibly time-correlated distribution within a Wasserstein-2 ambiguity set. We aim to minimize the worst-case expected regret, i.e., the excess cost of a causal policy compared to a non-causal one with access to future disturbances. Though the optimal policy lacks a finite-order state-space realization (i.e., it is non-rational), it can be characterized by a finite-dimensional parameter. Leveraging this, we develop an efficient frequency-domain algorithm to compute this optimal control policy and present a convex optimization method to construct a near-optimal state-space controller that approximates the optimal non-rational controller in the $\mathit{H}_\infty$-norm. This approach avoids solving a computationally expensive semi-definite program (SDP) that scales with the time horizon in the finite-horizon setting.
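As a small illustration of what membership in a Wasserstein-2 ambiguity set means, the sketch below evaluates the Wasserstein-2 distance in the one case where it has a well-known closed form: two Gaussian distributions. This is an assumption made for the example only, since the abstract allows non-Gaussian and time-correlated disturbances. The helper names `psd_sqrt` and `gaussian_w2`, the 2-dimensional covariances, and the use of radius 1.5 (borrowed from the value of $r$ quoted in the figure captions) are hypothetical.

```python
import numpy as np


def psd_sqrt(mat):
    """Symmetric square root of a positive semi-definite matrix."""
    sym = 0.5 * (mat + mat.T)            # remove round-off asymmetry
    vals, vecs = np.linalg.eigh(sym)
    vals = np.clip(vals, 0.0, None)      # guard against tiny negatives
    return (vecs * np.sqrt(vals)) @ vecs.T


def gaussian_w2(mean1, cov1, mean2, cov2):
    """Wasserstein-2 distance between N(mean1, cov1) and N(mean2, cov2).

    Uses the closed form
      W2^2 = ||m1 - m2||^2
             + tr(C1 + C2 - 2 (C2^{1/2} C1 C2^{1/2})^{1/2}).
    """
    root2 = psd_sqrt(cov2)
    cross = psd_sqrt(root2 @ cov1 @ root2)
    sq_dist = (np.sum((np.asarray(mean1) - np.asarray(mean2)) ** 2)
               + np.trace(cov1 + cov2 - 2.0 * cross))
    return float(np.sqrt(max(sq_dist, 0.0)))


# Hypothetical nominal model: white noise with identity covariance.
nominal_mean, nominal_cov = np.zeros(2), np.eye(2)
# Hypothetical candidate: same mean, standard deviation doubled.
candidate_mean, candidate_cov = np.zeros(2), 4.0 * np.eye(2)

distance = gaussian_w2(nominal_mean, nominal_cov,
                       candidate_mean, candidate_cov)
radius = 1.5                             # illustrative ambiguity radius
in_ambiguity_set = distance <= radius
```

Here the two covariances commute, so the distance reduces to the Frobenius norm of the difference of their square roots, which gives $\sqrt{2}$ and places the candidate inside a ball of radius 1.5 around the nominal model.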

Paper Structure

This paper contains 50 sections, 18 theorems, 84 equations, 8 figures, 2 tables, and 4 algorithms.

Key Result

Theorem 3.1

Under asmp:nominal, the steady-state worst-case expected regret $\overline{\operatorname{\textsf{\small Reg}}}_{\infty}(\mathcal{K},r)$ incurred by a causal policy $\mathcal{K}\!\in\!\mathscr{K}$ is equivalent to the following: which takes a finite value whenever $\mathcal{R}_\mathcal{K}$ is bounded. Additionally, the worst-case disturbance is obtained from $w_\star \coloneqq (\mathcal{I}-\gamma_

Figures (8)

  • Figure 1: Variation of $\mathcal{N}$ with $r$ and the performance of the DR-RO controller versus the $\mathit{H}_2$, $\mathit{H}_{\infty}$, and RO controllers.
  • Figure 2: The control costs of different DR controllers under (a) white noise and (b) the worst-case disturbance for DR-RO in infinite horizon, for system [AC15]. The finite-horizon controllers are re-applied every $s=30$ steps. The infinite-horizon DR-RO controller achieves the lowest average cost compared to the finite-horizon controllers.
  • Figure 3: The control costs of different DR controllers under (a) worst-case disturbances for DR-RO in finite horizon and (b) worst-case disturbances for DR-LQR in finite horizon, for system [AC15]. The finite-horizon controllers are re-applied every $s=30$ steps. Although the finite-horizon DR controllers are designed to minimize the cost under these specific disturbances, they are outperformed by the infinite-horizon DR-RO controller.
  • Figure 4: The control costs of the different DR controllers: (I) DR-RO in infinite horizon, (II) DR-RO in finite horizon, and (III) DR-LQR in finite horizon, under different disturbances for system [REA4] aircraft. (a) is white noise, while (b), (c), and (d) are worst-case disturbances for each of the controllers, for $r=1.5$. The finite-horizon controllers are re-applied every $s=30$ steps. For all disturbances, the infinite-horizon DR-RO controller achieves the lowest average cost, even in cases (c) and (d), where the finite-horizon DR controllers are designed to minimize the cost.
  • Figure 5: The control costs of the different DR controllers: (I) DR-RO in infinite horizon, (II) DR-RO in finite horizon, and (III) DR-LQR in finite horizon, under different disturbances for system [HE3] aircraft. (a) is white noise, while (b), (c), and (d) are worst-case disturbances for each of the controllers, for $r=1.5$. The finite-horizon controllers are re-applied every $s=30$ steps. For all disturbances, the infinite-horizon DR-RO controller achieves the lowest average cost, even in cases (c) and (d), where the finite-horizon DR controllers are designed to minimize the cost.
  • ...and 3 more figures

Theorems & Definitions (25)

  • Definition 2.1
  • Theorem 3.1: A Variational Formula for $\overline{\operatorname{\textsf{\small Reg}}}_{\infty}$ [kargin2023wasserstein]
  • Theorem 3.2: A saddle-point problem for DR-RO
  • Remark 3.3
  • Corollary 3.4
  • Lemma 4.1
  • Theorem 4.2
  • Corollary 4.3
  • Remark 4.4
  • Theorem 4.5: Convergence of $\mathcal{M}_k$
  • ...and 15 more