ABM-UDE: Developing Surrogates for Epidemic Agent-Based Models via Scientific Machine Learning

Sharv Murgai, Utkarsh Utkarsh, Kyle C. Nguyen, Alan Edelman, Erin C. S. Acquesta, Christopher Vincent Rackauckas

TL;DR

County-ready surrogates that learn directly from exascale ABM trajectories using Universal Differential Equations (UDEs) are developed, providing a portable path to distill agent-based simulators into fast, trustworthy surrogates for other scientific domains.

Abstract

Agent-based epidemic models (ABMs) encode behavioral and policy heterogeneity but are too slow for nightly hospital planning. We develop county-ready surrogates that learn directly from exascale ABM trajectories using Universal Differential Equations (UDEs): mechanistic SEIR-family ODEs with a neural-parameterized contact rate $\kappa_\phi(u,t)$ (no additive residual). Our contributions are threefold: we adapt multiple shooting and an observer-based prediction-error method (PEM) to stabilize identification of neural-augmented epidemiological dynamics across intervention-driven regime shifts; we enforce positivity and mass conservation and show the learned contact-rate parameterization yields a well-posed vector field; and we quantify accuracy, calibration, and compute against ABM ensembles and UDE baselines. On a representative ExaEpi scenario, PEM-UDE reduces mean MSE by 77% relative to single-shooting UDE (3.00 vs. 13.14) and by 20% relative to MS-UDE (3.75). Reliability improves in parallel: empirical coverage of ABM $10$-$90\%$ and $25$-$75\%$ bands rises from 0.68/0.43 (UDE) and 0.79/0.55 (MS-UDE) to 0.86/0.61 with PEM-UDE and 0.94/0.69 with MS+PEM-UDE, indicating calibrated uncertainty rather than overconfident fits. Inference runs in seconds on commodity CPUs (20-35 s per $\sim$90-day forecast), enabling nightly "what-if" sweeps on a laptop. Relative to a $\sim$100 CPU-hour ABM reference run, this yields $\sim10^{4}\times$ lower wall-clock per scenario. This closes the realism-cadence gap, supports threshold-aware decision-making (e.g., maintaining ICU occupancy $<75\%$), preserves mechanistic interpretability, and enables calibrated, risk-aware scenario planning on standard institutional hardware. Beyond epidemics, the ABM$\to$UDE recipe provides a portable path to distill agent-based simulators into fast, trustworthy surrogates for other scientific domains.
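To make the structure described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a plain four-compartment SEIR model (the paper uses a richer SEIR-family model), illustrative rates `sigma` and `gamma`, and a small untrained MLP standing in for $\kappa_\phi(u,t)$; the names `init_params`, `kappa`, `rhs`, and `simulate` are all hypothetical. It shows how a softplus output keeps the contact rate positive and how routing the neural term only through the $S \to E$ flow (no additive residual) conserves total population by construction.

```python
import numpy as np

def softplus(z):
    # Numerically stable softplus; the output is strictly positive.
    return np.logaddexp(0.0, z)

def init_params(rng, n_in=5, n_hidden=8):
    # Hypothetical tiny MLP for the contact rate; sizes are illustrative.
    return {
        "W1": 0.1 * rng.standard_normal((n_hidden, n_in)),
        "b1": np.zeros(n_hidden),
        "W2": 0.1 * rng.standard_normal(n_hidden),
        "b2": 0.0,
    }

def kappa(params, u, t, t_scale=90.0):
    # Neural contact rate kappa_phi(u, t) > 0 via a softplus output layer.
    x = np.concatenate([u, [t / t_scale]])
    h = np.tanh(params["W1"] @ x + params["b1"])
    return softplus(params["W2"] @ h + params["b2"])

def rhs(u, t, params, sigma=1 / 5, gamma=1 / 7):
    # u = (S, E, I, R) as population fractions; sigma and gamma are assumed rates.
    S, E, I, R = u
    infection = kappa(params, u, t) * S * I  # the only learned term
    # Every outflow reappears as an inflow, so the components sum to zero.
    return np.array([
        -infection,
        infection - sigma * E,
        sigma * E - gamma * I,
        gamma * I,
    ])

def simulate(u0, params, t_end=90.0, dt=0.1):
    # Fixed-step classical RK4 integrator, used here to stay dependency-free.
    n = int(round(t_end / dt))
    traj = np.empty((n + 1, 4))
    u, t = np.array(u0, dtype=float), 0.0
    traj[0] = u
    for k in range(n):
        k1 = rhs(u, t, params)
        k2 = rhs(u + 0.5 * dt * k1, t + 0.5 * dt, params)
        k3 = rhs(u + 0.5 * dt * k2, t + 0.5 * dt, params)
        k4 = rhs(u + dt * k3, t + dt, params)
        u = u + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
        traj[k + 1] = u
    return traj

if __name__ == "__main__":
    params = init_params(np.random.default_rng(0))
    traj = simulate([0.99, 0.01, 0.0, 0.0], params)
    print("max |S+E+I+R - 1| =", np.abs(traj.sum(axis=1) - 1.0).max())
    print("min compartment value =", traj.min())
```

In this sketch the weights are random, so the trajectory is not a fit to any data; training would adjust `params` against ABM output. The point is only that conservation and a nonnegative contact rate come from the model structure rather than from the learned weights.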

Paper Structure (38 sections, 26 equations, 3 figures, 4 tables)

Figures (3)

  • Figure 1: Pipeline overview (one-scenario evaluation). Data: ExaEpi (ABM) generates a single representative SEInsIsIaDR outbreak trajectory (Dataset 1). Known constraints: population balance, compartmental flow, and mechanistic ODE structure constrain learning. Learn dynamics: a neural-parameterized UDE is trained with stabilization strategies: Vanilla UDE, MS-UDE, PEM-UDE, and MS+PEM-UDE. Ensemble robustness: each method is trained as a 100-seed ensemble to quantify sensitivity to initialization and optimization stochasticity. Inference: the trained UDE supports fast "what-if" forward simulation by varying $x_0$ and contact schedules $\kappa(t)$ on commodity hardware.
  • Figure 2: MSE versus epochs for all methods (100-seed average).
  • Figure 3: Trajectory fit and ensemble robustness (Dataset 1, 100 seeds). Each panel stacks four rows: (top) Vanilla UDE, (second) MS-UDE, (third) PEM-UDE, and (bottom) MS+PEM-UDE. Black x markers denote ABM ground truth. Shaded regions indicate $\pm1\sigma$ and $\pm3\sigma$ variability across 100 independently trained surrogates (random initialization and optimizer stochasticity); band scaling (e.g., "bands $\times 50$") is annotated in the plots. The vertical dotted line marks intervention onset at day 49. MS+PEM-UDE is reported as ensemble mean and uncertainty bands (no single-seed trace).
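The ensemble bands described in the Figure 3 caption can be illustrated with a short sketch. It assumes the per-seed surrogate trajectories are stacked in an array of shape `(n_seeds, n_times)` and that the bands are the pointwise mean plus or minus multiples of the pointwise sample standard deviation; the function name `ensemble_bands` and the synthetic curves below are hypothetical stand-ins, not the paper's data or code.

```python
import numpy as np

def ensemble_bands(trajs, widths=(1.0, 3.0)):
    # trajs: array of shape (n_seeds, n_times), one row per trained surrogate.
    trajs = np.asarray(trajs, dtype=float)
    mean = trajs.mean(axis=0)
    std = trajs.std(axis=0, ddof=1)  # sample standard deviation across seeds
    # Map each width w to the pointwise band (mean - w*std, mean + w*std).
    bands = {w: (mean - w * std, mean + w * std) for w in widths}
    return mean, std, bands

if __name__ == "__main__":
    # Synthetic stand-in for 100 independently trained surrogates over 90 days.
    rng = np.random.default_rng(0)
    t = np.arange(90)
    base = np.exp(-0.5 * ((t - 45) / 12.0) ** 2)  # toy outbreak-shaped curve
    trajs = base + 0.02 * rng.standard_normal((100, t.size))
    mean, std, bands = ensemble_bands(trajs)
    lo1, hi1 = bands[1.0]
    lo3, hi3 = bands[3.0]
    print("mean band half-width (1 sigma):", (hi1 - lo1).mean() / 2)
    print("mean band half-width (3 sigma):", (hi3 - lo3).mean() / 2)
```

When the seed-to-seed spread is small relative to the signal, bands computed this way can be too narrow to see, which is presumably why the plots annotate a band scaling factor.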