Forecasting infectious disease prevalence with associated uncertainty using neural networks
Michael Morris
TL;DR
This work addresses the challenge of forecasting infectious disease prevalence with reliable uncertainty estimates. It presents two complementary approaches: Bayesian neural networks that combine Web search data with historical rates to forecast influenza-like illness (ILI), with uncertainty provided by Bayesian layers, and neural ordinary differential equations (neural ODEs) that fuse mechanistic compartmental models with neural augmentation under a universal differential equations (UDE) framework. The iterative recurrent neural network (IRNN) is the strongest performer among the NN variants, reducing MAE by 10.3% and improving Skill by 17.1% on average relative to state-of-the-art baselines, while IRNN_s improves the calibration of uncertainty at some cost to mean accuracy. In parallel, the thesis investigates neural ODE-based hybrids (UDEs) for disease forecasting, showing that mechanistic priors can guide learning but that integrating Web search data with ODE-based models remains challenging. Overall, the work demonstrates that uncertainty-aware neural models, especially those leveraging exogenous data, offer competitive forecasts and calibrated predictive intervals, with practical implications for public health decision-making and epidemic surveillance.
Abstract
Infectious diseases pose significant human and economic burdens. Accurately forecasting disease incidence can enable public health agencies to respond effectively to existing or emerging diseases. Despite progress in the field, developing accurate forecasting models remains a significant challenge. This thesis proposes two methodological frameworks that use neural networks (NNs) with associated uncertainty estimates, a critical component whose absence has so far limited the application of NNs to epidemic forecasting. We develop our frameworks by forecasting influenza-like illness (ILI) in the United States. Our first proposed method uses Web search activity data in conjunction with historical ILI rates as observations for training NN architectures. Our models incorporate Bayesian layers to produce uncertainty intervals, positioning themselves as legitimate alternatives to more conventional approaches. The best-performing architecture, the iterative recurrent neural network (IRNN), reduces mean absolute error by 10.3% and improves Skill by 17.1% on average in forecasting tasks across four flu seasons compared to the state-of-the-art. We build on this method by introducing IRNN_s, an architecture that changes the sampling procedure of the IRNN to improve its uncertainty estimation. Our second framework uses neural ordinary differential equations (neural ODEs) to bridge the gap between mechanistic compartmental models and NNs, benefiting from the physical constraints that compartmental models provide. We evaluate eight neural ODE models that use a mixture of ILI rates and Web search activity data to provide forecasts. These are compared with the IRNN and with IRNN0, the IRNN using only ILI rates. Neural ODE models trained without Web search activity data outperform the IRNN0 by 16% in terms of Skill. Future work should focus on using Web search data more effectively in neural ODEs so that they can compete with the best-performing IRNN.
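
To make the second framework more concrete, the following is a minimal sketch of how a compartmental model can be augmented with a neural network. It is not one of the eight models evaluated in the thesis: the choice of an SIR model, the softplus-constrained transmission rate, the forward Euler integrator, and all names (`mlp`, `init_params`, `ude_rhs`, `simulate`) are assumptions made for illustration, and the network is left untrained.

```python
import numpy as np

def mlp(params, x):
    """Tiny one-hidden-layer network standing in for the learned term."""
    W1, b1, W2, b2 = params
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2

def init_params(rng, n_in=2, n_hidden=8, n_out=1, scale=0.1):
    """Random (untrained) weights; a real model would fit these to data."""
    return (
        scale * rng.standard_normal((n_hidden, n_in)),
        np.zeros(n_hidden),
        scale * rng.standard_normal((n_out, n_hidden)),
        np.zeros(n_out),
    )

def ude_rhs(state, params, gamma):
    """SIR right-hand side whose transmission rate is produced by the network."""
    S, I, R = state
    raw = mlp(params, np.array([S, I]))[0]
    # Softplus keeps the learned rate non-negative (illustrative assumption).
    beta = np.log1p(np.exp(raw))
    # Keeping the S * I factor preserves the mechanistic structure:
    # no new infections when either compartment is empty.
    new_infections = beta * S * I
    recoveries = gamma * I
    return np.array([-new_infections, new_infections - recoveries, recoveries])

def simulate(state0, params, gamma=0.3, dt=0.1, n_steps=200):
    """Forward Euler integration; returns an array of shape (n_steps + 1, 3)."""
    traj = np.empty((n_steps + 1, 3))
    traj[0] = state0
    for t in range(n_steps):
        traj[t + 1] = traj[t] + dt * ude_rhs(traj[t], params, gamma)
    return traj

rng = np.random.default_rng(0)
params = init_params(rng)
# Compartments are population fractions: 99% susceptible, 1% infected.
traj = simulate(np.array([0.99, 0.01, 0.0]), params)
```

Because the three derivatives sum to zero, the total population is conserved regardless of the network weights, which illustrates the kind of physical constraint a compartmental prior imposes. In practice the weights would be fitted by differentiating through an ODE solver, and an adaptive solver would normally replace the fixed-step Euler loop.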
