Infinite-Horizon Distributionally Robust Regret-Optimal Control

Taylan Kargin; Joudi Hajar; Vikrant Malik; Babak Hassibi

TL;DR

The paper tackles infinite-horizon distributionally robust regret-optimal (DR-RO) control for discrete-time LTI systems under a Wasserstein-2 ambiguity set around the disturbance distribution, aiming to minimize the steady-state worst-case expected regret relative to a clairvoyant non-causal policy. It derives a tractable saddle-point formulation and proves strong duality, showing that the optimal DR-RO controller is non-rational but admits a finite-dimensional parameterization that can be computed efficiently in the frequency domain using a Frank-Wolfe-type algorithm. A key contribution is a convex, LMI-based procedure that approximates the non-rational controller in the $H_\infty$-norm with a near-optimal rational state-space controller, enabling practical real-time implementations. The approach yields stability guarantees and robustness to time-correlated disturbances. In the reported experiments, the infinite-horizon DR-RO controller outperforms finite-horizon methods and standard $H_2$/$H_\infty$ baselines, and the rational approximations closely match the performance of the non-rational controller. Overall, the work delivers a scalable, implementable framework for DR regret-optimal control under distributional uncertainty and time correlation, bridging theory and practical controller design.
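To make the phrase "Frank-Wolfe-type algorithm" concrete, here is a minimal sketch of the generic Frank-Wolfe (conditional gradient) iteration on a toy problem. It is not the paper's frequency-domain method: the feasible set (the probability simplex), the objective, and the names `frank_wolfe_simplex`, `grad`, and `target` are all assumptions chosen for illustration. The sketch only shows the pattern such methods share: solve a linear subproblem over the feasible set, then take a convex-combination step toward its solution.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, num_iters=200):
    """Generic Frank-Wolfe over the probability simplex (illustrative only).

    grad: callable returning the gradient of a smooth convex objective.
    x0:   starting point on the simplex (nonnegative entries summing to 1).
    """
    x = np.asarray(x0, dtype=float).copy()
    for k in range(num_iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, the minimizer of
        # <g, s> is the vertex at the smallest gradient coordinate.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        # Standard diminishing step size; the iterate stays feasible because
        # it is a convex combination of feasible points.
        step = 2.0 / (k + 2.0)
        x = x + step * (s - x)
    return x

# Toy objective (hypothetical): 0.5 * ||x - target||^2 with target on the simplex.
target = np.array([0.2, 0.5, 0.3])
x_hat = frank_wolfe_simplex(lambda x: x - target, np.array([1.0, 0.0, 0.0]), num_iters=2000)
```

The appeal of this scheme is that it needs no projection step, only a linear subproblem per iteration, which is cheap whenever that subproblem has a closed-form solution.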

Abstract

We study the infinite-horizon distributionally robust (DR) control of linear systems with quadratic costs, where disturbances have an unknown, possibly time-correlated distribution within a Wasserstein-2 ambiguity set. We aim to minimize the worst-case expected regret, i.e., the excess cost of a causal policy compared to a non-causal one with access to future disturbances. Though the optimal policy lacks a finite-order state-space realization (i.e., it is non-rational), it can be characterized by a finite-dimensional parameter. Leveraging this, we develop an efficient frequency-domain algorithm to compute this optimal control policy and present a convex optimization method to construct a near-optimal state-space controller that approximates the optimal non-rational controller in the $\mathit{H}_\infty$-norm. This approach avoids solving a computationally expensive semi-definite program (SDP) that scales with the time horizon in the finite-horizon setting.
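As a small illustration of what a Wasserstein-2 ambiguity set of radius $r$ means, the sketch below evaluates the closed-form Wasserstein-2 distance between two zero-mean Gaussians, $W_2^2 = \operatorname{tr}\big(\Sigma_a + \Sigma_b - 2(\Sigma_a^{1/2}\Sigma_b\Sigma_a^{1/2})^{1/2}\big)$, and checks whether a candidate covariance lies within radius $r$ of a nominal one. This is a finite-dimensional Gaussian special case assumed here for illustration; the paper's ambiguity set is defined over disturbance processes, and the helper names `psd_sqrt`, `w2_gaussian`, and `in_ambiguity_set` are hypothetical.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a symmetric positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh((m + m.T) / 2.0)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T

def w2_gaussian(cov_a, cov_b):
    """Wasserstein-2 distance between N(0, cov_a) and N(0, cov_b)."""
    root_a = psd_sqrt(cov_a)
    cross = psd_sqrt(root_a @ cov_b @ root_a)
    squared = np.trace(cov_a + cov_b - 2.0 * cross)
    return float(np.sqrt(max(squared, 0.0)))

def in_ambiguity_set(cov_nominal, cov_candidate, r):
    """True if the candidate Gaussian is within W2 radius r of the nominal one."""
    return w2_gaussian(cov_nominal, cov_candidate) <= r

# Hypothetical example: white nominal noise versus a disturbance with 4x the variance.
nominal = np.eye(2)
candidate = 4.0 * np.eye(2)
distance = w2_gaussian(nominal, candidate)
```

Larger radii admit distributions further from the nominal one, so the radius acts as a knob trading nominal performance for robustness.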

Paper Structure (50 sections, 18 theorems, 84 equations, 8 figures, 2 tables, 4 algorithms)

Introduction
Contributions
Preliminaries
Linear-Quadratic Control
The Regret-Optimal Control Framework
Distributionally Robust Regret-Optimal Control
A Saddle-Point Problem
An Efficient Algorithm
Finite-Dimensional Parametrization of $\mathcal{M}_\star$
An Iterative Optimization in the Frequency Domain
Rational Approximation
State-Space Models from Rational Power Spectra
Rational Approximation using $\mathit{H}_\infty$-norm
Obtaining State-Space Controllers
Numerical Experiments
...and 35 more sections

Key Result

Theorem 3.1

Under asmp:nominal, the steady-state worst-case expected regret $\overline{\operatorname{\textsf{\small Reg}}}_{\infty}(\mathcal{K},r)$ incurred by a causal policy $\mathcal{K}\!\in\!\mathscr{K}$ is equivalent to the following: which takes a finite value whenever $\mathcal{R}_\mathcal{K}$ is bounded. Additionally, the worst-case disturbance is obtained from $w_\star \coloneqq (\mathcal{I}-\gamma_

Figures (8)

Figure 1: Variation of $\mathcal{N}$ with $r$ and the performance of the DR-RO controller versus the $\mathit{H}_2$, $\mathit{H}_{\infty}$, and RO controllers.
Figure 2: The control costs of different DR controllers under (a) white noise and (b) worst disturbance for DR-RO in infinite horizon, for system [AC15]. The finite-horizon controllers are re-applied every $s=30$ steps. The infinite horizon DR-RO controller achieves the lowest average cost compared to the finite-horizon controllers.
Figure 3: The control costs of different DR controllers under (a) worst disturbances for DR-RO in finite horizon and (b) worst disturbances for DR-LQR in finite horizon, for system [AC15]. The finite-horizon controllers are re-applied every $s=30$ steps. Despite being designed to minimize the cost under specific disturbances, the finite horizon DR controllers are outperformed by the infinite horizon DR-RO controller.
Figure 4: The control costs of the different DR controllers: (I) DR-RO in infinite horizon, (II) DR-RO in finite horizon and (III) DR-LQR in finite horizon under different disturbances for system [REA4] aircraft. (a) is white noise, while (b), (c) and (d) are worst-case disturbances for each of the controllers, for $r=1.5$. The finite-horizon controllers are re-applied every $s=30$ steps. For all disturbances, the infinite-horizon DR-RO controller achieves the lowest average cost, even in cases (c) and (d), where the disturbances are the worst case for the finite-horizon DR controllers, which are designed to minimize the cost under them.
Figure 5: The control costs of the different DR controllers: (I) DR-RO in infinite horizon, (II) DR-RO in finite horizon and (III) DR-LQR in finite horizon under different disturbances for system [HE3] aircraft. (a) is white noise, while (b), (c) and (d) are worst-case disturbances for each of the controllers, for $r=1.5$. The finite-horizon controllers are re-applied every $s=30$ steps. For all disturbances, the infinite-horizon DR-RO controller achieves the lowest average cost, even in cases (c) and (d), where the disturbances are the worst case for the finite-horizon DR controllers, which are designed to minimize the cost under them.
...and 3 more figures

Theorems & Definitions (25)

Definition 2.1
Theorem 3.1: A Variational Formula for $\overline{\operatorname{\textsf{\small Reg}}}_{\infty}$ [kargin2023wasserstein]
Theorem 3.2: A saddle-point problem for DR-RO
Remark 3.3
Corollary 3.4
Lemma 4.1
Theorem 4.2
Corollary 4.3
Remark 4.4
Theorem 4.5: Convergence of $\mathcal{M}_k$
...and 15 more
