Forecasting infectious disease prevalence with associated uncertainty using neural networks

Michael Morris

TL;DR

This work addresses the challenge of forecasting infectious disease prevalence with reliable uncertainty estimates. It presents two complementary approaches: Bayesian neural networks that incorporate Web search data to forecast influenza-like illness (ILI) and provide uncertainty via Bayesian layers, and neural ordinary differential equations (neural ODEs) that fuse mechanistic compartmental models with neural augmentation under a universal differential equations framework. The IRNN architecture emerges as the strongest performer among the NN variants, achieving notable reductions in MAE and improvements in Skill relative to state-of-the-art baselines, while IRNN_s enhances calibration of uncertainty at some cost to mean accuracy. In parallel, the thesis investigates neural ODE-based hybrids (UDEs) for disease forecasting, showing that mechanistic priors can guide learning but that integrating Web search data with ODE-based models remains challenging. Overall, the work demonstrates that uncertainty-aware neural models, especially those leveraging exogenous data, offer competitive forecasts and calibrated predictive intervals, with practical implications for public health decision-making and epidemic surveillance.

Abstract

Infectious diseases pose significant human and economic burdens. Accurately forecasting disease incidence can enable public health agencies to respond effectively to existing or emerging diseases. Despite progress in the field, developing accurate forecasting models remains a significant challenge. This thesis proposes two methodological frameworks using neural networks (NNs) with associated uncertainty estimates - a critical component whose absence has limited the application of NNs to epidemic forecasting thus far. We develop our frameworks by forecasting influenza-like illness (ILI) in the United States. Our first proposed method uses Web search activity data in conjunction with historical ILI rates as observations for training NN architectures. Our models incorporate Bayesian layers to produce uncertainty intervals, positioning them as legitimate alternatives to more conventional approaches. The best-performing architecture, the iterative recurrent neural network (IRNN), reduces mean absolute error by 10.3% and improves Skill by 17.1% on average in forecasting tasks across four flu seasons compared to the state-of-the-art. We build on this method by introducing IRNN_s, an architecture that changes the sampling procedure in the IRNN to improve the uncertainty estimation. Our second framework uses neural ordinary differential equations (neural ODEs) to bridge the gap between mechanistic compartmental models and NNs, benefiting from the physical constraints that compartmental models provide. We evaluate eight neural ODE models that use a mixture of ILI rates and Web search activity data to provide forecasts. These are compared with the IRNN and IRNN_0 - the IRNN using only ILI rates. The neural ODE models trained without Web search activity data outperform the IRNN_0 by 16% in terms of Skill. Future work should focus on using Web search data more effectively in neural ODEs so that they can compete with the best-performing IRNN.

Paper Structure (106 sections, 91 equations, 67 figures, 9 tables)

Figures (67)

  • Figure 1: Example linear model estimates. Linear model estimate trained on 10 data points sampled from $y=3x+5+\mathcal{N}(0, 1)$. The trained model fit is $y=2.23x+4.74$. The prediction is close to the data-generating model but has no uncertainty estimate.
  • Figure 2: Example Bayesian linear model estimates. Bayesian linear model estimates trained on 10 data points sampled from $y=3x+5+\mathcal{N}(0, 1)$. The posterior distribution over the parameters is given by $p(\bm{\Phi}|\mathbf{x},\mathbf{y}) = \mathcal{N}\left([4.74, 2.23],\begin{bmatrix}0.50 & 0\\ 0 & 0.32\end{bmatrix}\right)$; five samples from the model are shown. The prediction has the same mean as the deterministic linear model. The model uncertainty can be measured by integrating out the posterior. If the model is evaluated outside the training range, the predictions have a higher variance.
  • Figure 3: Combined uncertainty with changing training set size. Uncertainty averaged for $x\in[-1,1]$. As more data is used, the model uncertainty reduces and the data uncertainty becomes more accurate. In almost all cases the combined uncertainty is the best estimate.
  • Figure 4: Closed-form uncertainty estimates for $N=10$ training examples. Uncertainty intervals are shown at one standard deviation from the mean. The model uncertainty is small for $x\in[-1,1]$ and grows for $x<-1$ and $x>1$ (out-of-sample). Data uncertainty is constant for all $x$. Combined uncertainty is the sum of the variances of model and data uncertainty.
  • Figure 5: Closed-form combined uncertainty with changing training set size. Uncertainty averaged for $x\in[-1,1]$. As more data is used, the model uncertainty reduces and the data uncertainty becomes more accurate. In almost all cases the combined uncertainty is the best estimate.
  • ...and 62 more figures
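
The closed-form uncertainty decomposition described in the captions for Figures 2 and 4 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis code: it assumes a zero-mean Gaussian prior with variance `prior_var` on the intercept and slope, a known noise variance of 1, training inputs drawn uniformly from $[-1,1]$, and a fixed random seed. None of these are given in the captions, and the function names are hypothetical, so the fitted values will differ from the $y=2.23x+4.74$ reported above.

```python
import numpy as np


def fit_bayesian_linear(x, y, noise_var=1.0, prior_var=100.0):
    """Closed-form Gaussian posterior over [intercept, slope].

    Assumes a zero-mean isotropic Gaussian prior (variance prior_var)
    and a known observation noise variance (noise_var).
    """
    X = np.column_stack([np.ones_like(x), x])  # design matrix [1, x]
    precision = np.eye(2) / prior_var + X.T @ X / noise_var
    cov = np.linalg.inv(precision)             # posterior covariance
    mean = cov @ X.T @ y / noise_var           # posterior mean
    return mean, cov


def predictive_variances(x_new, cov, noise_var=1.0):
    """Return (model, data, combined) predictive variances at x_new."""
    X_new = np.column_stack([np.ones_like(x_new), x_new])
    # Model uncertainty: phi(x)^T cov phi(x) for each row phi(x) = [1, x].
    model_var = np.einsum("ij,jk,ik->i", X_new, cov, X_new)
    # Data uncertainty: constant observation noise for every x.
    data_var = np.full_like(x_new, noise_var, dtype=float)
    # Combined uncertainty: the variances add.
    return model_var, data_var, model_var + data_var


# Illustrative data: 10 points from y = 3x + 5 + N(0, 1), x in [-1, 1].
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=10)
y_train = 3.0 * x_train + 5.0 + rng.normal(0.0, 1.0, size=10)

mean, cov = fit_bayesian_linear(x_train, y_train)

# Evaluate inside (x = 0) and outside (x = -3, 3) the training range.
x_grid = np.array([-3.0, 0.0, 3.0])
model_var, data_var, combined_var = predictive_variances(x_grid, cov)

print("posterior mean [intercept, slope]:", mean)
print("model variance:   ", model_var)
print("data variance:    ", data_var)
print("combined variance:", combined_var)
```

In this sketch the model variance at $x=\pm 3$ exceeds the model variance at $x=0$, while the data variance stays constant, which matches the qualitative behaviour the Figure 4 caption describes.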