Online Control in Population Dynamics

Noah Golowich, Elad Hazan, Zhou Lu, Dhruv Rohatgi, Y. Jennifer Sun

TL;DR

The paper develops a robust online control framework for population dynamics by formulating simplex-based linear dynamical systems (LDS) and introducing a gradient-based controller, GPC-Simplex, with regret guarantees against a mixing-time-based comparator class. It addresses adversarial disturbances and time-varying costs, extending beyond the noiseless models common in epidemiology and population dynamics. Theoretical results establish a near-optimal regret bound of $\tilde{O}(\tau^{7/2} \sqrt{dT})$ against the class $\mathcal{K}^{\triangle}_{\tau}(\mathcal{L})$, along with a lower bound showing the necessity of the mixing assumption and a further lower bound against broader policy classes. Empirically, the approach applies to nonlinear dynamics such as SIR and replicator dynamics, including disease-control and hospital-flow scenarios where GPC-Simplex learns timely interventions and robustly handles perturbations. Overall, the work bridges online control and population dynamics, enabling scalable, provably robust, real-time control in epidemiological and ecological contexts under adversarial uncertainty.

Abstract

The study of population dynamics originated with early sociological works but has since extended into many fields, including biology, epidemiology, evolutionary game theory, and economics. Most studies on population dynamics focus on the problem of prediction rather than control. Existing mathematical models for control in population dynamics are often restricted to specific, noise-free dynamics, while real-world population changes can be complex and adversarial. To address this gap, we propose a new framework based on the paradigm of online control. We first characterize a set of linear dynamical systems that can naturally model evolving populations. We then give an efficient gradient-based controller for these systems, with near-optimal regret bounds with respect to a broad class of linear policies. Our empirical evaluations demonstrate the effectiveness of the proposed algorithm for control in population dynamics even for non-linear models such as SIR and replicator dynamics.
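
To make the idea of a linear system that "naturally models evolving populations" concrete, the sketch below simulates one simplex-preserving update. The exact update rule is an assumption made for illustration, not a statement of the paper's definition: the state is mixed through a column-stochastic matrix `A`, a control `u` with total mass below 1 is injected through a column-stochastic `B`, and a perturbation `w` enters with strength `gamma` (the symbols $w_t$ and $\gamma_t$ follow the notation in the figure captions). The matrices, the fixed control, and the helper names are hypothetical.

```python
import numpy as np

def simplex_lds_step(x, u, A, B, w, gamma):
    """One step of an ASSUMED simplex-preserving update:
    x' = (1 - gamma) * [(1 - ||u||_1) * A x + B u] + gamma * w."""
    controlled = (1.0 - u.sum()) * (A @ x) + B @ u
    return (1.0 - gamma) * controlled + gamma * w

def is_on_simplex(x, tol=1e-9):
    """Check that x is entrywise non-negative and sums to one."""
    return bool(np.all(x >= -tol) and abs(x.sum() - 1.0) <= tol)

rng = np.random.default_rng(0)
d = 3

# Hypothetical column-stochastic dynamics: every column of A sums to one.
A = rng.random((d, d))
A /= A.sum(axis=0, keepdims=True)
B = np.eye(d)  # identity is trivially column-stochastic

x = np.array([0.9, 0.1, 0.0])   # initial population distribution
w = np.array([0.0, 1.0, 0.0])   # perturbation pushes mass to coordinate 2
u = np.array([0.0, 0.0, 0.1])   # hypothetical fixed control with ||u||_1 = 0.1

trajectory = [x]
for _ in range(200):
    gamma = 0.01 * rng.binomial(1, 0.2)  # occasional small perturbation
    x = simplex_lds_step(x, u, A, B, w, gamma)
    trajectory.append(x)

print("all states on simplex:", all(is_on_simplex(s) for s in trajectory))
```

Under these assumptions every iterate is a convex combination of points on the simplex, so the state remains a valid population distribution regardless of the perturbation sequence.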

Paper Structure

This paper contains 50 sections, 20 theorems, 112 equations, 8 figures, and 2 algorithms.

Key Result

Theorem 1

There is a distribution $\mathcal{D}$ over LDSs with state space and control space given by $\mathbb{R}$, such that any online control algorithm on a system $\mathcal{L} \sim \mathcal{D}$ incurs expected regret $\Omega(T)$ against the class of time-invariant linear policies that marginally stabilize

Figures (8)

  • Figure 1: Control with cost function \ref{eq:quad-cost} for $T=200$ steps: initial distribution $x_1=[0.9, 0.1, 0.0]$; parameters $c_3=10$, $c_2=1$; no noise. Left/Middle: Cost and cumulative cost over time of GPC-Simplex versus baselines. Right: control $u_t(2)$ (proportional to effective transmission rate) played by GPC-Simplex over time.
  • Figure 2: Controlling hospital flows for $T=100$ steps: initial distribution $[0.9, 0.01, 0.09]$; parameters $y_{\max}=0.1$, $c_2=0.01$, $c_3=100$. Left: The dashed red line shows the number of infected over time under no control; note that $y_{\max}$ (dashed purple line) is significantly exceeded. The solid yellow and blue lines show the number of infected and susceptible under GPC-Simplex, which closely match the optimal solutions computed by ketcheson2021sir (dashed yellow and blue). Right: GPC-Simplex control (solid) vs. optimal control (dashed).
  • Figure 3: An intuitive illustration of $x_t(2)$ in the lower bound for simplex LDS (\ref{thm:lb-simplex}). The blue curve is the trajectory of $\pi^0$, the "decreasing" comparator policy, in the system $\mathcal{L}^0$, which has the smaller perturbation. The green curve is the trajectory of $\pi^1$, the "lazy" comparator policy, in the system $\mathcal{L}^1$, which has the larger perturbation. The orange curves correspond to the trajectories of an arbitrary policy $\pi$ under the two different perturbation sequences. The sum of regret under the two perturbation sequences is equal to the area $S_1+S_2+S_3$, which is shown to be $\Omega(T)$ for any $h$.
  • Figure 4: SIR with perturbations: $T=200$; initial state $x_1=[0.9, 0.1, 0]$; GPC-Simplex parameter $H=5$. Top: perturbation sequence $w_t=[0, 1, 0]$ and $\gamma_t\sim 0.01\cdot \mathrm{Ber}(0.2)$ for all $1\le t\le 200$. Bottom: $w_t$ is a normalized uniform random vector and $\gamma_t=0.01$ for all $1\le t\le 200$.
  • Figure 5: Control with costs over $T=200$ steps, with $\gamma_t=0$ for all $t$. SIR parameters: $\beta=0.5, \theta=0.03, \xi=0.005$. Initial state $x_1=[0.9, 0.1, 0]$. GPC-Simplex parameter: $H=5$. Left: instantaneous cost over time, compared with that of no control (green) and full control (orange). Middle: cumulative cost over time. Right: $u_t(2)$ output by GPC-Simplex over time. $(c_2,c_3)$ values (from top to bottom rows): $(1, 20), (1,10), (1,5), (1,1)$.
  • ...and 3 more figures
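
The SIR experiments in the captions above can be pictured with a small simulation. The sketch below is a hedged illustration, not the paper's experimental code: it assumes a standard discrete-time SIR update in which $\beta$ is the transmission rate, $\theta$ the recovery rate, and $\xi$ the rate at which immunity is lost (these roles are assumptions; the captions give only the values). The hypothetical `transmission_scale` argument stands in for an intervention that scales the effective transmission rate, with `1.0` read as "no control" and `0.0` as "full control".

```python
import numpy as np

def sir_step(x, beta, theta, xi, transmission_scale=1.0):
    """One step of an ASSUMED discrete-time SIR model on the simplex.
    x = [susceptible, infected, recovered] population fractions."""
    s, i, r = x
    new_infections = transmission_scale * beta * s * i
    recoveries = theta * i
    lost_immunity = xi * r
    return np.array([
        s - new_infections + lost_immunity,
        i + new_infections - recoveries,
        r + recoveries - lost_immunity,
    ])

def simulate_sir(x1, T, beta, theta, xi, transmission_scale=1.0):
    """Roll the model forward and return a (T, 3) array of states."""
    states = [np.asarray(x1, dtype=float)]
    for _ in range(T - 1):
        states.append(sir_step(states[-1], beta, theta, xi, transmission_scale))
    return np.stack(states)

# Parameter values taken from the Figure 5 caption.
x1, T = [0.9, 0.1, 0.0], 200
beta, theta, xi = 0.5, 0.03, 0.005

no_control = simulate_sir(x1, T, beta, theta, xi, transmission_scale=1.0)
full_control = simulate_sir(x1, T, beta, theta, xi, transmission_scale=0.0)

print("peak infected, no control:  ", no_control[:, 1].max())
print("peak infected, full control:", full_control[:, 1].max())
```

The three flows cancel in the sum, so the state stays on the simplex at every step. A learned controller would sit between the two baselines, trading the cost of intervention against the cost of infection.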

Theorems & Definitions (48)

  • Theorem 1: Informal statement of \ref{thm:lb-marginally-stable}
  • Theorem 2: Informal version of \ref{thm:stoch-lds-regret}
  • Definition 3: Simplex LDS
  • Definition 4
  • Definition 5
  • Definition 6: Mixing a simplex LDS
  • Theorem 7
  • Theorem 8: Informal statement of \ref{thm:lb-simplex}
  • Definition 9: LDS
  • Lemma 10: Mirror descent
  • ...and 38 more