Epidemiological Model Calibration via Graybox Bayesian Optimization

Puhua Niu, Byung-Jun Yoon, Xiaoning Qian

TL;DR

The paper tackles the challenge of calibrating expensive epidemiological simulators by introducing a graybox Bayesian optimization framework that exploits known SIQR dynamics through per-compartment Gaussian process surrogates and a function-network structure. A composite objective $g(\mathbf{y}(x))=\frac{1}{T}\sum_i (d_i- y_i(x))^2$ guides calibration, while a Knowledge-Gradient-based acquisition with decoupled decisions leverages incomplete observations to improve efficiency. Across simulated and real COVID-19 data, graybox BO variants (notably KG-CF, KG-FN, and DG-CF) show faster convergence and lower $\log(\mathrm{MSE})$ than traditional blackbox approaches, and decoupled acquisition frequently yields additional gains, especially under data missingness. These results suggest that incorporating model structure and observation patterns into BO can enable fast, accurate calibration of complex epidemic models, with potential extension to agent-based models and larger, interconnected sub-systems.
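A minimal sketch of the composite objective $g(\mathbf{y}(x))$ may help make the calibration target concrete. It assumes the observed data $d$ and the simulator outputs $\mathbf{y}(x)$ are stored as arrays of shape (compartments, $T$) and that the sum runs over every entry. The function name `composite_objective` and the array layout are illustrative choices, not taken from the paper.

```python
import numpy as np

def composite_objective(d, y):
    """Sum of squared differences between data d and model output y, divided by T.

    Assumption: both arrays have shape (n_compartments, T), with T time points.
    """
    d = np.asarray(d, dtype=float)
    y = np.asarray(y, dtype=float)
    if d.shape != y.shape:
        raise ValueError("d and y must have the same shape")
    T = d.shape[-1]  # number of time points
    return float(np.sum((d - y) ** 2) / T)

# Toy example with 2 compartments and T = 2 time points (made-up numbers).
d = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([[1.0, 1.0], [1.0, 4.0]])
g = composite_objective(d, y)  # squared errors 0 + 1 + 4 + 0 = 5, so g = 5 / 2 = 2.5
print(g, np.log(g))            # the second value is the log-error scale used for reporting
```

Smaller values of $g$ mean a closer fit, so a maximizing BO loop would work with $-g$.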

Abstract

In this study, we focus on developing efficient calibration methods via Bayesian decision-making for the family of compartmental epidemiological models. Existing calibration methods usually assume that the compartmental model is cheap in terms of its output and gradient evaluation, which may not hold in practice when extending them to more general settings. Therefore, we introduce model calibration methods based on a "graybox" Bayesian optimization (BO) scheme, which enables more efficient calibration for general epidemiological models. This approach uses Gaussian processes as a surrogate to the expensive model, and leverages the functional structure of the compartmental model to enhance calibration performance. Additionally, we develop model calibration methods via a decoupled decision-making strategy for BO, which further exploits the decomposable nature of the functional structure. The calibration efficiencies of the multiple proposed schemes are evaluated on data generated by a compartmental model mimicking real-world epidemic processes and on real-world COVID-19 datasets. Experimental results demonstrate that our proposed graybox variants of BO schemes can efficiently calibrate computationally expensive models, improve the calibration performance measured by the logarithm of the mean squared error, and converge in fewer BO iterations. We anticipate that the proposed calibration methods can be extended to enable fast calibration of more complex epidemiological models, such as agent-based models.

Paper Structure

This paper contains 12 sections, 3 theorems, 12 equations, 9 figures, 1 table, 2 algorithms.

Key Result

Lemma 1

Let $G_n^i(x)= \overline{\mathbb{E}}_{n,x}\big[\,\overline{\mathbb{E}}_{n+1}[g(y'_i(x'_\ast,Pa'_i))]-\overline{\mathbb{E}}_n[g(y_i(x_\ast,Pa_i))]\,\big]$, where $x_\ast=\mathrm{argmax}_{x}\overline{\mathbb{E}}_{n}[g(y_i(x,Pa_i))]$, and $x'_\ast=\mathrm{argmax}_{x'}\overline{\mathbb{E}}_{n+1}[g(y'_i(x',Pa'_i))]$
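The quantity in Lemma 1 has the form of a knowledge-gradient value: the expected increase in the best posterior mean after one more observation. The sketch below estimates the standard knowledge gradient for a Gaussian posterior over a finite candidate set by Monte Carlo. It does not implement the paper's per-node, composite-objective version; the function name, its arguments, and the finite candidate set are assumptions made for illustration.

```python
import numpy as np

def knowledge_gradient(mu, cov, x_idx, noise_var=0.0, n_samples=2000, seed=0):
    """Monte Carlo estimate of E_n[max_x' mu_{n+1}(x')] - max_x' mu_n(x').

    mu, cov: current posterior mean (m,) and covariance (m, m) over m candidates.
    x_idx:   index of the candidate to be evaluated next.
    Assumes cov[x_idx, x_idx] + noise_var > 0.
    """
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    # Shift of every candidate's posterior mean per unit standard-normal draw Z:
    # mu_{n+1} = mu_n + sigma_tilde * Z for a Gaussian posterior.
    sigma_tilde = cov[:, x_idx] / np.sqrt(cov[x_idx, x_idx] + noise_var)
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_samples)
    z = np.concatenate([z, -z])  # antithetic pairs: the sample mean of z is zero
    next_means = mu[None, :] + z[:, None] * sigma_tilde[None, :]
    return float(next_means.max(axis=1).mean() - mu.max())

# Toy example: two independent candidates with equal means (made-up numbers).
kg = knowledge_gradient(mu=[0.0, 0.0], cov=np.eye(2), x_idx=0)
print(kg)  # the exact value for this case is 1 / sqrt(2 * pi)
```

In a BO loop, the candidate with the largest estimated value would be evaluated next.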

Figures (9)

  • Figure 1: Diagram depicting an SIQR compartmental epidemiological model. Each arrow indicates that the population rise of the ending compartment is caused by the population decrease of the starting compartment.
  • Figure 2: Function network structure of the SIQR model calibration, where the two bold arrows indicate that $x$ and $g$ are the parent node and child node of $y_{1:4}$ respectively.
  • Figure 3: Schematic illustration of the model calibration workflow based on graybox BO, where we leverage expert knowledge about functional dependency and metric function.
  • Figure 4: The simulated ground-truth population fraction trajectories of each compartment for 30 days. (a) Trajectories from an SIQR model with linear derivative functions. (b) Trajectories corresponding to the case when the derivative function $\lambda^\ast(t)$ is non-linear. (c) The trajectory of $\lambda^\ast(t)$ when it is set to be non-linear.
  • Figure 5: Calibration performance. The logarithm of the MSE is shown with respect to the number of BO iterations. For the $n$-th iteration, the MSE is computed as $-\sum_t f^t(x)$, where $x=\mathrm{argmax}_x \hat{u}_n(x)$. Solid lines show the mean values of five runs and the shaded regions correspond to the standard deviations around the means. (a) Performance for a linear SIQR model $\zeta$. (b) Performance for a linear $\zeta$ in the presence of noise. (c) Performance for a non-linear $\zeta$.
  • ...and 4 more figures

Theorems & Definitions (5)

  • Lemma 1
  • proof
  • Lemma 2: Lemma B.3 in astudillo2021bayesian
  • Theorem 1
  • proof