Online Optimization and Ambiguity-based Learning of Distributionally Uncertain Dynamic Systems

Dan Li, Dariush Fooladivanda, Sonia Martinez

TL;DR

The paper addresses online optimization for systems with distributional uncertainty in dynamics by learning a control-dependent ambiguity set within a Wasserstein framework. It develops a tractable reformulation that upper-bounds the worst-case expected loss over the ambiguity set using an empirical center plus a Lipschitz penalty, and specializes to optimal control and online resource allocation. An online smoothing scheme (Moreau-Yosida) paired with a Nesterov-inspired accelerated-gradient method yields provable regret bounds via dissipativity analysis, with guarantees that improve as more history is assimilated. The approach is demonstrated on a nonlinear vehicle control problem and an online asset-allocation task, showing real-time learning of dynamics and low probabilistic regret, while maintaining online tractability. Overall, the work provides a principled, data-driven method to achieve online decisions with probabilistic performance guarantees under distributional uncertainty, offering a less conservative alternative to classic robust methods.
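The "empirical center plus a Lipschitz penalty" bound can be illustrated with a minimal sketch. It assumes a Wasserstein-1 ball of radius `radius` around the empirical distribution and a loss that is Lipschitz with constant `lipschitz`; the function name, the sample data, and the loss $|x - 1|$ are hypothetical and are not taken from the paper.

```python
import numpy as np

def wasserstein_upper_bound(loss, samples, radius, lipschitz):
    """Upper bound on the worst-case expected loss over a Wasserstein-1 ball.

    Assumption: `loss` is Lipschitz with constant `lipschitz`, so any
    distribution within distance `radius` of the empirical one can raise
    the expected loss by at most radius * lipschitz.
    """
    empirical = float(np.mean([loss(x) for x in samples]))
    return empirical + radius * lipschitz

# Hypothetical data and loss: |x - 1| is 1-Lipschitz.
loss = lambda x: abs(x - 1.0)
samples = np.array([0.0, 1.0, 2.0, 3.0])
bound = wasserstein_upper_bound(loss, samples, radius=0.5, lipschitz=1.0)

# Shifting every sample by 0.5 gives a distribution at Wasserstein-1
# distance 0.5 from the empirical one, so it lies inside the ball.
shifted_loss = float(np.mean([loss(x + 0.5) for x in samples]))
```

Here `shifted_loss` stays below `bound`, as the Lipschitz argument predicts. The paper's reformulation uses a control-dependent center and radius, which this sketch does not model.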

Abstract

This paper proposes a novel approach to construct data-driven online solutions to optimization problems (P) subject to a class of distributionally uncertain dynamical systems. The introduced framework allows for the simultaneous learning of distributional system uncertainty via a parameterized, control-dependent ambiguity set using a finite historical data set, and its use to make online decisions with probabilistic regret function bounds. Leveraging the merits of Machine Learning, the main technical approach relies on the theory of Distributionally Robust Optimization (DRO) to hedge against uncertainty and provide less conservative results than standard Robust Optimization approaches. Starting from recent results that describe ambiguity sets via parameterized, control-dependent empirical distributions and ambiguity radii, we first present a tractable reformulation of the corresponding optimization problem that maintains the probabilistic guarantees. We then specialize these problems to the cases of 1) optimal one-stage control of distributionally uncertain nonlinear systems, and 2) resource allocation under distributional uncertainty. A novelty of this work is that it extends DRO to online optimization problems subject to a distributionally uncertain dynamical system constraint, handled via a control-dependent ambiguity set that leads to online-tractable optimization with probabilistic guarantees on regret bounds. Further, we introduce an online version of Nesterov's accelerated-gradient algorithm and analyze its performance in solving this class of problems via dissipativity theory.
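As a rough illustration of pairing Moreau-Yosida smoothing with an accelerated-gradient method, the sketch below smooths the nonsmooth function $|z|$ and minimizes the result with the standard (offline) Nesterov momentum schedule $(k-1)/(k+2)$. This is not the paper's online variant; the function names, the target value, and the parameter `mu` are assumptions made for the example.

```python
import numpy as np

def moreau_abs(z, mu):
    """Moreau-Yosida envelope of |z| with parameter mu (the Huber function)."""
    a = np.abs(z)
    return np.where(a <= mu, a**2 / (2.0 * mu), a - mu / 2.0)

def moreau_abs_grad(z, mu):
    """Gradient of the envelope; it is Lipschitz with constant 1 / mu."""
    return np.clip(z / mu, -1.0, 1.0)

def nesterov_minimize(grad, x0, step, iters):
    """Standard Nesterov accelerated gradient with momentum (k-1)/(k+2)."""
    x_prev = x = float(x0)
    for k in range(1, iters + 1):
        y = x + (k - 1.0) / (k + 2.0) * (x - x_prev)  # extrapolation step
        x_prev, x = x, y - step * grad(y)             # gradient step at y
    return x

mu = 0.1        # smoothing parameter (hypothetical value)
target = 2.0    # minimizer of |x - target| (hypothetical value)
x_star = nesterov_minimize(
    lambda x: float(moreau_abs_grad(x - target, mu)),
    x0=10.0,
    step=mu,    # step size 1 / L, with L = 1 / mu
    iters=1000,
)
```

A smaller `mu` tracks $|z|$ more closely but increases the gradient's Lipschitz constant, which forces a smaller step size.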

Paper Structure

This paper contains 13 sections, 8 theorems, 134 equations, 9 figures, 1 algorithm.

Key Result

Theorem 3.1

Let Assumptions assump:subG and assump:predictor hold. For a given $T \in \mathbb{Z}_{> 0}$, historical data $\{ \hat{\boldsymbol{x}}_k \}_{k\in\mathcal{T}}$ and $\{ \boldsymbol{u}_k \}_{k\in\mathcal{T}\setminus\{t\}}$, $\mathcal{T}= \{t-T,\dots, t\}$, we select $\hat{\mathbb{P}}_{t+1|t}$ as in eq:e Here, the left-hand-side expression is a shorthand for the probability of the event $\{(\boldsymbol

Figures (9)

  • Figure 1: A two-wheeled vehicle model with $(x,y)\in \mathbb{R}^2$ the position of the center and $\theta$ the direction.
  • Figure 2: The (gray) planned trajectory and (black) actual system trajectory in various road zones, with the system state $\boldsymbol{x}=(x,y,\theta) \in \mathbb{R}^2 \times [-\pi,\pi)$. The red region indicates the sandy zone, while the blue region indicates the slippery zone. Due to unknown road conditions, the actual system trajectories deviate from the planned trajectories.
  • Figure 3: An example of the (gray) planned trajectory and (black) controlled system trajectory in various road zones, with the system state $\boldsymbol{x}=(x,y,\theta)$. The red region indicates the sandy zone, while the blue region indicates the slippery zone. With the implemented control, the vehicle follows the planned path with low regret with high probability.
  • Figure 4: (a) The (gray) control signal provided by the planner and an example of the (black) control signal derived from the proposed approach. (b) The realized loss $\ell$ and the achieved objective of \ref{eq:P2}.
  • Figure 5: The components $\alpha_1$ and $\alpha_2$ of the real-time parameter $\boldsymbol{\alpha}:=(\alpha_1,\alpha_2,\alpha_3)$ in the learning procedure.
  • ...and 4 more figures

Theorems & Definitions (17)

  • Remark 3.1: On sub-Gaussian distributions
  • Theorem 3.1: Online probabilistic guarantee [DL-DF-SM:20-lcss]
  • Lemma 4.1: An upper bound of \ref{eq:P1}
  • Theorem 4.1: Equivalent reformulation of \ref{eq:P1}
  • Remark 4.1: Effects of Assumptions \ref{assump:cvxloss} and \ref{assump:gradient}
  • Lemma 4.2: Quantification of $L$
  • Definition 5.1: Smoothable function [AB-MT:12]
  • Lemma 5.1: Moreau-Yosida approximation
  • Lemma 5.2: Examples of \ref{eq:P2smooth}
  • Theorem 5.1: Probabilistic regret bound of \ref{eq:P1}
  • ...and 7 more