Generalized Machine Learning for Fast Calibration of Agent-Based Epidemic Models

Sima Najafzadehkhoei, George Vega Yon, Derek S. Meyer, Bernardo Modenesi

Abstract

Agent-based models (ABMs) are widely used to study infectious disease dynamics, but their calibration is often computationally intensive, limiting their applicability in time-sensitive public health settings. We propose DeepIMC (Deep Inverse Mapping Calibration), a machine learning-based calibration framework that directly learns the inverse mapping from epidemic time series to epidemiological parameters. DeepIMC trains a bidirectional Long Short-Term Memory (BiLSTM) neural network on synthetic epidemic trajectories generated from agent-based models such as the Susceptible-Infected-Recovered (SIR) model, enabling rapid parameter estimation without repeated simulation at inference time. We evaluate DeepIMC through an extensive simulation study comprising 5,000 heterogeneous epidemic scenarios and benchmark its performance against Approximate Bayesian Computation (ABC) using likelihood-free Markov Chain Monte Carlo. The results show that DeepIMC substantially improves parameter recovery accuracy, produces sharp and well-calibrated predictive intervals, and reduces computational time by more than an order of magnitude relative to ABC. Although structural parameter identifiability constraints limit the precise recovery of all model parameters simultaneously, the calibrated models reliably reproduce epidemic trajectories and support accurate forward prediction with their estimated parameters. DeepIMC is implemented in the open-source R package epiworldRCalibrate, facilitating practical adoption for real-time epidemic modeling and policy analysis. Overall, our findings demonstrate that DeepIMC provides a scalable, operationally effective alternative to traditional simulation-based calibration methods for agent-based epidemic models.

Paper Structure

This paper contains 19 sections, 5 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Architecture of the proposed DeepIMC model. A univariate incidence time series ($T=60$) is processed through three stacked bidirectional LSTM layers (160 hidden units per direction, dropout rate 0.5). The final forward and backward hidden representations are concatenated and combined with two additional epidemiological inputs (population size $n$ and recovery rate). The resulting feature vector (dimension $320 + 2 = 322$) is passed through two fully connected layers ($322 \rightarrow 64 \rightarrow 3$), with ReLU activation in the first layer. The network outputs three epidemiological quantities: transmission rate (sigmoid activation), contact rate (softplus activation), and basic reproduction number $R_0$ (softplus activation). To enforce epidemiological consistency, the constraint $R_0 = \frac{\text{contact rate} \times \text{transmission rate}}{\text{recovery rate}}$ is incorporated as a penalty term during training.
  • Figure 2: Predicted epidemic curves showing susceptible, infected, and recovered populations over 40 days for a representative subset of parameter simulations. Credible intervals indicate model uncertainty.
  • Figure 3: Comparative analysis of predictive bias and coverage between ABC and DeepIMC methods over a 60-day forecast period. The DeepIMC model consistently demonstrates lower bias and superior coverage accuracy.
  • Figure 4: Performance of the DeepIMC model trained on ABM-generated epidemiological time-series data in predicting transmission probability ($p_{\text{tran}}$), contact rate ($c_{\text{rate}}$), and basic reproduction number ($R_0$). Scatter plots compare predicted versus actual values for each parameter, with the red dashed line indicating the ideal 1:1 correspondence. The Mean Absolute Errors (MAEs) are reported in parentheses for each metric.
  • Figure 5: Violin and box plots showing the distribution of parameter estimation bias for ABC and DeepIMC methods across three key epidemiological parameters: Contact Rate, Transmission Rate, and Basic Reproduction Number ($R_0$).
  • ...and 1 more figure
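
The output stage and consistency constraint described in the Figure 1 caption can be illustrated with a minimal sketch in plain Python. It shows only how three raw network outputs might be mapped to a transmission rate (sigmoid), a contact rate (softplus), and $R_0$ (softplus), and how a penalty could measure disagreement with $R_0 = \frac{\text{contact rate} \times \text{transmission rate}}{\text{recovery rate}}$. The function names and the squared-difference form of the penalty are assumptions made for illustration; the caption states only that the constraint enters training as a penalty term, and this sketch is not the epiworldRCalibrate implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softplus(x: float) -> float:
    """Numerically stable softplus log(1 + exp(x)); output is always positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def output_heads(raw):
    """Map three raw outputs to (transmission rate, contact rate, R0).

    Activations follow the Figure 1 caption: sigmoid for the transmission
    rate, softplus for the contact rate and for R0.
    """
    z_tran, z_contact, z_r0 = raw
    return sigmoid(z_tran), softplus(z_contact), softplus(z_r0)


def r0_consistency_penalty(p_tran, c_rate, r0, recovery_rate):
    """Squared gap between the predicted R0 and the R0 implied by the other
    two outputs. The squared form is an assumption, not taken from the paper.
    """
    implied_r0 = c_rate * p_tran / recovery_rate
    return (r0 - implied_r0) ** 2


if __name__ == "__main__":
    # Hypothetical raw outputs and recovery rate, chosen only for illustration.
    p_tran, c_rate, r0 = output_heads([-1.0, 2.0, 1.5])
    penalty = r0_consistency_penalty(p_tran, c_rate, r0, recovery_rate=0.25)
    print(p_tran, c_rate, r0, penalty)
```

In a training loop, a penalty of this kind would typically be added to the parameter-recovery loss with a weighting factor. The weight is a tuning choice that the caption does not specify.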