Epidemiological Model Calibration via Graybox Bayesian Optimization
Puhua Niu, Byung-Jun Yoon, Xiaoning Qian
TL;DR
The paper tackles the challenge of calibrating expensive epidemiological simulators by introducing a graybox Bayesian optimization framework that exploits known SIQR dynamics through per-compartment Gaussian process surrogates and a function-network structure. A composite objective $g(\mathbf{y}(x))=\frac{1}{T}\sum_i (d_i- y_i(x))^2$ guides calibration, while a Knowledge-Gradient-based acquisition with decoupled decisions leverages incomplete observations to improve efficiency. Across simulated and real COVID-19 data, graybox BO variants (notably KG-CF, KG-FN, and DG-CF) show faster convergence and lower $\log(\mathrm{MSE})$ than traditional blackbox approaches, and decoupled acquisition frequently yields additional gains, especially under data missingness. These results suggest that incorporating model structure and observation patterns into BO can enable fast, accurate calibration of complex epidemic models, with potential extension to agent-based models and larger, interconnected sub-systems.
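As a rough illustration of the composite objective above, the following minimal sketch computes $g(\mathbf{y}(x))$ and its logarithm from an observed series and a simulated series. The function names `composite_objective` and `log_mse` are hypothetical, and the sketch assumes that $T$ equals the number of terms in the sum; the paper's exact indexing may differ.

```python
import numpy as np

def composite_objective(d, y):
    """g(y(x)) = (1/T) * sum_i (d_i - y_i(x))^2.

    d: observed data; y: simulator outputs at parameter x.
    Assumption: T is taken to be the number of compared entries.
    """
    d = np.asarray(d, dtype=float)
    y = np.asarray(y, dtype=float)
    if d.shape != y.shape:
        raise ValueError("d and y must have the same shape")
    T = d.size
    return float(np.sum((d - y) ** 2) / T)

def log_mse(d, y):
    """Logarithm of the composite objective, the metric reported in the TL;DR."""
    return float(np.log(composite_objective(d, y)))
```

In a blackbox setting a surrogate would model $g$ directly as a function of $x$; in the graybox setting described here, surrogates model the components $y_i(x)$ and the known form of $g$ is applied on top of them.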
Abstract
In this study, we develop efficient calibration methods based on Bayesian decision-making for the family of compartmental epidemiological models. Existing calibration methods usually assume that the compartmental model is cheap to evaluate, in terms of both its output and its gradient, an assumption that may not hold in practice when these methods are extended to more general settings. We therefore introduce model calibration methods based on a "graybox" Bayesian optimization (BO) scheme, which enables more efficient calibration of general epidemiological models. This approach uses Gaussian processes as surrogates for the expensive model and leverages the functional structure of the compartmental model to enhance calibration performance. Additionally, we develop model calibration methods based on a decoupled decision-making strategy for BO, which further exploits the decomposable nature of the functional structure. We evaluate the calibration efficiency of the proposed schemes on data generated by a compartmental model that mimics real-world epidemic processes and on real-world COVID-19 datasets. Experimental results demonstrate that the proposed graybox BO variants can efficiently calibrate computationally expensive models, improve calibration performance as measured by the logarithm of the mean squared error, and converge in fewer BO iterations. We anticipate that the proposed calibration methods can be extended to enable fast calibration of more complex epidemiological models, such as agent-based models.
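To make the notion of a compartmental model with per-compartment outputs concrete, the sketch below simulates a simple SIQR (susceptible, infectious, quarantined, recovered) model with a forward-Euler step. This is an illustrative assumption, not the parameterization used in the paper: the function name `simulate_siqr`, the rate names `beta`, `q`, and `gamma`, and the transition structure (S to I, I to Q, Q to R) are all hypothetical choices.

```python
import numpy as np

def simulate_siqr(beta, q, gamma, s0, i0, q0=0.0, r0=0.0, steps=100, dt=1.0):
    """Forward-Euler simulation of an assumed SIQR model.

    beta: transmission rate, q: quarantine rate, gamma: recovery rate
    of quarantined individuals (all hypothetical parameter names).
    Returns an array of shape (steps + 1, 4) with columns S, I, Q, R.
    """
    n = s0 + i0 + q0 + r0  # total population, conserved by the updates below
    traj = np.empty((steps + 1, 4))
    traj[0] = (s0, i0, q0, r0)
    for t in range(steps):
        s, i, qq, r = traj[t]
        new_inf = beta * s * i / n * dt   # S -> I
        new_quar = q * i * dt             # I -> Q
        new_rec = gamma * qq * dt         # Q -> R
        traj[t + 1] = (s - new_inf,
                       i + new_inf - new_quar,
                       qq + new_quar - new_rec,
                       r + new_rec)
    return traj
```

Each column of the returned array is one compartment trajectory. A graybox scheme of the kind described above could, under these assumptions, fit a separate surrogate to each compartment's output rather than a single surrogate to the final calibration error.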
