An early warning indicator trained on stochastic disease-spreading models with different noises

Amit K. Chakraborty; Shan Gao; Reza Miry; Pouria Ramazi; Russell Greiner; Mark A. Lewis; Hao Wang

TL;DR

This work tackles early warning of disease outbreaks under stochastic dynamics by training a CNN-LSTM deep learning (DL) ensemble on noise-induced SIR/SEIR models that incorporate additive white, multiplicative environmental, and demographic noise. The SIDATR-family classifiers, especially SIDATR-500, generally outperform traditional early warning indicators (EWIs; variance and lag-1 autocorrelation) across simulated noise models and can detect impending transcritical transitions, while PODATR shows model-specific strengths. Tests on Edmonton COVID-19 data reveal challenges in real-world applicability due to short series length, yet SIDATR-100 demonstrates reasonable predictive power when data length is limited. Overall, training on noise-rich synthetic data enhances early warning signal (EWS) capability and provides a promising avenue for rapid outbreak anticipation, with practical implications for public health preparedness and response.

Abstract

The timely detection of disease outbreaks through reliable early warning signals (EWSs) is indispensable for effective public health mitigation strategies. Nevertheless, the intricate dynamics of real-world disease spread, often influenced by diverse sources of noise and limited data in the early stages of outbreaks, pose a significant challenge in developing reliable EWSs, as the performance of existing indicators varies with extrinsic and intrinsic noises. Here, we address this challenge by incorporating additive white noise, multiplicative environmental noise, and demographic noise into a standard epidemic mathematical model. To navigate the complexities introduced by these noise sources, we employ a deep learning algorithm that provides EWSs for infectious disease outbreaks by training on noise-induced disease-spreading models. The indicator's effectiveness is demonstrated through its application to real-world COVID-19 cases in Edmonton and to simulated time series derived from diverse disease spread models affected by noise. Notably, the indicator captures an impending transition in a time series of disease outbreaks and outperforms existing indicators. This study contributes to advancing early warning capabilities by addressing the intricate dynamics inherent in real-world disease spread, presenting a promising avenue for enhancing public health preparedness and response efforts.

Paper Structure (17 sections, 13 equations, 3 figures, 1 table)

Introduction
Mathematical models and the impacts of noises
Noise-induced models
SIR and SEIR models with additive white noise
SIR model with multiplicative environmental noise
SIR model with demographic noise
Methods
Simulated training data
Deep learning algorithm and training
Testing
Mathematical models
Empirical data
Prediction and ROC curve
Results
Performance on mathematical models
...and 2 more sections

Figures (3)

Figure 1: (a) Sample linear increases of $\beta(t)$ for null (green) and transcritical (purple) simulations. A transition occurs when $\beta$ crosses the critical value $\beta_c$ (brown horizontal line), with the transition time marked by the intersection of the two lines (red dotted vertical lines). In transcritical simulations, $\beta$ crosses $\beta_c$ at a random time between 0 and 1500, while in null simulations, $\beta$ does not cross $\beta_c$ before time 1500. (b) If a transition occurs between time 0 and 1500, the number of infected individuals at the 500 (100) points preceding the bifurcation point is used as training data for transcritical simulations. (c) In the absence of a transition between time 0 and 1500, the number of infected individuals at the most recent 500 (100) time points is used as training data for null simulations.
Figure 2: Area under the ROC curve for the generic EWIs (variance and lag-1 AC) and for the PODATR, SIDATR-500, and SIDATR-100 DL models. Performance was assessed on the last five predictions of transcritical and null simulations of the SIR model with white noise (yellow), SIR model with multiplicative environmental noise (green), SIR model with demographic noise (cyan), SEIR model with white noise (brown), and COVID-19 dataset of Edmonton (blue).
Figure 3: Frequency distribution of the favored probability among transcritical (blue) and null (orange) simulations generated by the (a) SIDATR-100, (b) SIDATR-500, and (c) PODATR models. In each input simulation, the SIDATR-100 and SIDATR-500 models assign probabilities for transcritical and null bifurcations, while PODATR assigns probabilities for fold, Hopf, transcritical, and null bifurcations. Based on the last five favored probabilities in each simulation, a total of 50 frequencies from the transcritical simulations and another 50 frequencies from the null simulations were extracted from each mathematical model. Additionally, for the COVID-19 data, a total of 35 frequencies of each type were extracted.
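
The window selection described in the Figure 1 caption, together with the two generic EWIs compared in Figure 2, can be illustrated with a minimal Python sketch. It is not the authors' code. The model form (an SIR model in population fractions with births and deaths, additive white noise, Euler-Maruyama stepping, and clipping at zero), all parameter values, and every function name below are assumptions made for illustration; under this assumed model the critical value is $\beta_c = \gamma + \mu$.

```python
import numpy as np

def beta_ramp(beta_start, beta_end, n_steps=1500):
    """Linearly increasing transmission rate beta(t), one value per time step."""
    return np.linspace(beta_start, beta_end, n_steps)

def transition_index(beta, beta_c):
    """First index where beta reaches beta_c, or None if it never does (null run)."""
    crossed = np.nonzero(beta >= beta_c)[0]
    return int(crossed[0]) if crossed.size else None

def simulate_infected(beta, gamma=0.1, mu=0.01, sigma=1e-3, dt=1.0, seed=0):
    """Euler-Maruyama steps for an ASSUMED SIR model with additive white noise.

    State variables are population fractions; gamma, mu, sigma and dt are
    illustrative values, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    s, i = 0.999, 1e-3
    infected = np.empty(len(beta))
    for k, b in enumerate(beta):
        noise_s, noise_i = sigma * np.sqrt(dt) * rng.standard_normal(2)
        ds = (mu - b * s * i - mu * s) * dt + noise_s
        di = (b * s * i - (gamma + mu) * i) * dt + noise_i
        # Clip at zero so additive noise cannot produce negative fractions
        # (a simplification chosen for this sketch).
        s = max(s + ds, 0.0)
        i = max(i + di, 0.0)
        infected[k] = i
    return infected

def training_window(infected, t_cross, length=500):
    """Points leading up to the transition, or the most recent points if none occurs."""
    end = len(infected) if t_cross is None else t_cross
    return infected[max(0, end - length):end]

def lag1_autocorrelation(x):
    """Pearson correlation between the series and itself shifted by one step."""
    x = np.asarray(x, dtype=float)
    return float(np.corrcoef(x[:-1], x[1:])[0, 1])

def rolling_indicators(window, width):
    """Variance and lag-1 autocorrelation over sliding sub-windows of a series."""
    starts = range(len(window) - width + 1)
    var = np.array([np.var(window[j:j + width]) for j in starts])
    ac1 = np.array([lag1_autocorrelation(window[j:j + width]) for j in starts])
    return var, ac1

if __name__ == "__main__":
    beta_c = 0.1 + 0.01                      # gamma + mu in the assumed model
    beta = beta_ramp(0.05, 0.2)              # transcritical run: beta crosses beta_c
    infected = simulate_infected(beta)
    window = training_window(infected, transition_index(beta, beta_c), length=500)
    variance, ac1 = rolling_indicators(window, width=250)
    print(len(window), variance[-1], ac1[-1])
```

Passing `length=100` yields the shorter window that corresponds to the 100-point setting in the caption, and a ramp that stays below `beta_c` (so that `transition_index` returns `None`) reproduces the null case, where the most recent points are taken instead.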
