Introduction to Online Control

Elad Hazan; Karan Singh

TL;DR

This work introduces online nonstochastic control, which recasts the control of dynamical systems as online convex optimization with adversarially chosen disturbances and losses. Performance is measured by regret against the best policy in hindsight from a benchmark class. The text presents algorithms such as the Gradient Perturbation Controller (GPC), which optimizes over the policy class of Disturbance Action Controllers (DAC), with sublinear regret guarantees. It builds from classical control and MDPs through linear dynamical systems, online learning primitives, and system identification to online Kalman-style filtering and prediction under uncertainty. The resulting methods carry finite-time guarantees and rely on computationally tractable convex optimization, supporting robust, adaptive control when the environment is adversarial or unknown.
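To make the GPC idea concrete, here is a minimal sketch in Python, not the book's algorithm as stated. It assumes a known, stable scalar system `x[t+1] = a*x[t] + b*u[t] + w[t]`, a fixed quadratic cost `x**2 + u**2`, a deterministic disturbance chosen for illustration, and a box projection on the parameters. The function name `run_gpc` and all parameter values are hypothetical.

```python
import numpy as np

def run_gpc(a=0.9, b=1.0, H=5, lr=0.002, T=500, learn=True):
    """Simplified gradient-based disturbance-action control on a scalar system.

    Assumed dynamics: x[t+1] = a*x[t] + b*u[t] + w[t], cost x**2 + u**2.
    Returns the total cost incurred over T steps.
    """
    M = np.zeros(H)               # DAC parameters: u[t] = sum_i M[i] * w[t-1-i]
    w_hist = np.zeros(2 * H)      # w_hist[k] = w[t-1-k]; zero before time 0
    powers = a ** np.arange(H)    # a^0, ..., a^(H-1)
    x, total_cost = 0.0, 0.0
    for t in range(T):
        # Disturbance-action policy (or zero control for the baseline).
        u = float(M @ w_hist[:H]) if learn else 0.0
        total_cost += x**2 + u**2
        # Illustrative deterministic disturbance (an assumption of this sketch).
        w = 1.0 + 0.5 * np.sin(0.1 * t)
        x_next = a * x + b * u + w
        # With known (a, b), the disturbance is recovered from observed states.
        w_obs = x_next - a * x - b * u
        w_hist = np.concatenate(([w_obs], w_hist[:-1]))  # now w_hist[k] = w[t-k]
        x = x_next
        if learn:
            # Counterfactual state had M been used for the last H steps:
            # y(M) = c0 + g @ M, affine in M, so the surrogate loss is convex.
            c0 = powers @ w_hist[:H]
            g = b * np.array([powers @ w_hist[i + 1:i + 1 + H] for i in range(H)])
            y = c0 + g @ M
            v = M @ w_hist[:H]    # counterfactual action
            grad = 2.0 * y * g + 2.0 * v * w_hist[:H]
            # Gradient step followed by projection onto a box (assumed constraint set).
            M = np.clip(M - lr * grad, -1.0, 1.0)
    return total_cost

if __name__ == "__main__":
    print("learned controller cost:", run_gpc(learn=True))
    print("zero-control cost:      ", run_gpc(learn=False))
```

Because the action depends only on past disturbances, the counterfactual state is affine in `M`, which is what lets plain gradient descent on a convex surrogate stand in for policy optimization in this sketch.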

Abstract

This text presents an introduction to an emerging paradigm in control of dynamical systems and differentiable reinforcement learning called online nonstochastic control. The new approach applies techniques from online convex optimization and convex relaxations to obtain new methods with provable guarantees for classical settings in optimal and robust control. The primary distinction between online nonstochastic control and other frameworks is the objective. In optimal control, robust control, and other control methodologies that assume stochastic noise, the goal is to perform comparably to an offline optimal strategy. In online nonstochastic control, both the cost functions as well as the perturbations from the assumed dynamical model are chosen by an adversary. Thus the optimal policy is not defined a priori. Rather, the target is to attain low regret against the best policy in hindsight from a benchmark class of policies. This objective suggests the use of the decision making framework of online convex optimization as an algorithmic methodology. The resulting methods are based on iterative mathematical optimization algorithms, and are accompanied by finite-time regret and computational complexity guarantees.
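For concreteness, the regret objective described above is commonly written as follows. The notation is an assumption made here for illustration and is not taken from this page: $\mathcal{A}$ is the controller, $\Pi$ the benchmark policy class, $c_t$ the adversarially chosen cost at time $t$, $(x_t, u_t)$ the states and controls produced by $\mathcal{A}$, and $(x_t^{\pi}, u_t^{\pi})$ those that policy $\pi$ would have produced under the same disturbances.

$$
\mathrm{regret}_T(\mathcal{A}, \Pi) = \sum_{t=1}^{T} c_t(x_t, u_t) - \min_{\pi \in \Pi} \sum_{t=1}^{T} c_t(x_t^{\pi}, u_t^{\pi})
$$

A guarantee is called sublinear when this quantity grows as $o(T)$, so that the average per-step gap to the best benchmark policy vanishes as $T$ grows.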

Paper Structure (140 sections, 47 theorems, 269 equations, 22 figures, 5 algorithms)

Background in Control and Reinforcement Learning
Introduction
What is This Book About?
The Origins of Control
Formalization and Examples of a Control Problem
Example: Control of a Medical Ventilator
Simple Control Algorithms
The Bang-Bang Controller
The PID Controller
Classical Theory: Optimal and Robust Control
The Need for a New Theory
Online Nonstochastic Control Theory
A New Family of Algorithms
Bibliographic Remarks
Dynamical Systems
...and 125 more sections

Key Result

Theorem 2.5

For a family of dynamical systems described as polynomials with integer coefficients, determining the stabilizability of any member is NP-hard.

Figures (22)

Figure 1: A centrifugal governor.
Figure 2: Performance of the PID controller on a mechanical ventilator, from Suo et al. (2021).
Figure 3: A schematic of the respiratory circuit, from Suo et al. (2021).
Figure 4: Double integrator illustration showing state coordinate $x_1(t)$ (position), state coordinate $x_2(t)$ (velocity), mass $m$, and control input $u(t)$.
Figure 5: Pendulum swing-up illustration with marked angle $\theta_t$, gravitational force $mg$, control input torque $u_t$, mass $m$, and rod length $l$.
...and 17 more figures

Theorems & Definitions (102)

Definition 1.1: Dynamical system
Definition 1.2: A generic control problem
Definition 1.3
Definition 2.1
Definition 2.2
Definition 2.3
Definition 2.4
Theorem 2.5
proof
Definition 3.1
...and 92 more
