Advancing Real-time Pandemic Forecasting Using Large Language Models: A COVID-19 Case Study

Hongru Du; Jianan Zhao; Yang Zhao; Shaochong Xu; Xihong Lin; Yiran Chen; Lauren M. Gardner; Hao Frank Yang

TL;DR

PandemicLLM is an open-source framework that reframes real-time pandemic forecasting as a text reasoning problem, using large language models to integrate multi-modal data: epidemiological time series, public health policy, genomic surveillance, and demographics. It introduces an AI-human cooperative prompt design and a GRU-based temporal encoder that transform heterogeneous data into prompts for LLMs, and it formulates the target as an ordinal 5-class hospitalization trend with forecast horizons of 1 and 3 weeks. In experiments across all 50 U.S. states, PandemicLLM outperforms traditional baselines by at least 20% and demonstrates robust trustworthiness under confidence-based evaluation. The results highlight the potential of LLM-based forecasting for real-time public health decision-making and its extensibility to other diseases and scales, including zero-shot adaptation to emerging variants.
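One way to read the confidence-based evaluation is as accuracy stratified by how confident each prediction is. The sketch below is illustrative only: it assumes that confidence is the maximum predicted class probability and uses hypothetical bin edges, neither of which is confirmed by the source. The function name and sample data are made up.

```python
def accuracy_by_confidence(probs, labels, edges=(0.0, 0.5, 0.7, 0.9)):
    """Return {bin_lower_edge: (accuracy, count)} for predictions grouped by confidence.

    probs  : list of per-class probability lists (one list per prediction)
    labels : list of true class indices
    edges  : ASSUMPTION, hypothetical lower edges of the confidence bins
    """
    hits = {e: 0 for e in edges}
    counts = {e: 0 for e in edges}
    for p, y in zip(probs, labels):
        confidence = max(p)              # ASSUMPTION: confidence = top class probability
        predicted = p.index(confidence)  # index of the most probable class
        # Assign the prediction to the largest lower edge not exceeding its confidence.
        bin_edge = max(e for e in edges if e <= confidence)
        counts[bin_edge] += 1
        hits[bin_edge] += int(predicted == y)
    # Report only bins that received at least one prediction.
    return {e: (hits[e] / counts[e], counts[e]) for e in edges if counts[e] > 0}


# Made-up probabilities over five ordinal trend classes.
probs = [
    [0.05, 0.05, 0.80, 0.05, 0.05],
    [0.10, 0.60, 0.10, 0.10, 0.10],
    [0.30, 0.25, 0.20, 0.15, 0.10],
    [0.02, 0.02, 0.02, 0.02, 0.92],
    [0.05, 0.05, 0.75, 0.10, 0.05],
]
labels = [2, 2, 0, 4, 2]
report = accuracy_by_confidence(probs, labels)
```

If accuracy rises with the confidence bin, the model's confidence is informative about when its forecasts can be relied on, which is the property a trustworthiness analysis of this kind looks for.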

Abstract

Forecasting the short-term spread of an ongoing disease outbreak is a formidable challenge due to the complexity of contributing factors, some of which can be characterized through interlinked, multi-modality variables such as epidemiological time series data, viral biology, population demographics, and the intersection of public policy and human behavior. Existing forecasting model frameworks struggle with the multifaceted nature of relevant data and with robust translation of results, which hinders their performance and the provision of actionable insights for public health decision-makers. Our work introduces PandemicLLM, a novel framework with multi-modal Large Language Models (LLMs) that reformulates real-time forecasting of disease spread as a text reasoning problem, with the ability to incorporate real-time, complex, non-numerical information that was previously unattainable in traditional forecasting models. This approach encodes multi-modal data for LLMs through a unique AI-human cooperative prompt design and time series representation learning. The model is applied to the COVID-19 pandemic, trained to utilize textual public health policies, genomic surveillance, spatial, and epidemiological time series data, and subsequently tested across all 50 states of the U.S. Empirically, PandemicLLM is shown to be a high-performing pandemic forecasting framework that effectively captures the impact of emerging variants and can provide timely and accurate predictions. The proposed PandemicLLM opens avenues for incorporating various pandemic-related data in heterogeneous formats and exhibits performance benefits over existing models. This study illuminates the potential of adapting LLMs and representation learning to enhance pandemic forecasting, illustrating how AI innovations can strengthen pandemic responses and crisis management in the future.

Paper Structure (36 sections, 13 equations, 11 figures, 3 tables)

This paper contains 36 sections, 13 equations, 11 figures, 3 tables.

Introduction
Novelties and contributions
Data and Methods
Multi-modality pandemic data
The PandemicLLM framework
Pandemic forecasting as ordinal classification
AI-human cooperative prompt design
Experiment setup
Evaluation of PandemicLLM and reference models
Results
COVID-19 hospitalization trend prediction
Spatial performance evaluation
Comparison to reference models
Trustworthy and robust results
Integrating real-time genomic surveillance information for timely response
...and 21 more sections

Figures (11)

Figure 1: Overview of PandemicLLMs' pandemic data streams and pipeline. (a) Multi-modality pandemic data. Our multi-modality dataset integrates four types of pandemic data sources: spatial, epidemiological time series, public health policy, and genomic surveillance data. Spatial data include demographic and healthcare indicators, whereas the epidemiological time series cover reported cases, hospitalizations, and vaccination rates. Policy data detail governmental interventions in a textual format, and the genomic surveillance data combine textual descriptions of variants with weekly sequences of their prevalence. The data comprise 5,200 records, covering all 50 U.S. states over 104 weeks. The phylogenetic tree of SARS-CoV-2 was generated using Nextstrain [hadfield2018nextstrain]. (b) PandemicLLMs' construction pipeline. To forecast pandemic hospitalization trends, we formulate the problem as an ordinal classification task. We define five categories following CDC guidance [cdc_tracker]: Substantial Decrease, Moderate Decrease, Stable, Moderate Increase, and Substantial Increase. After the multi-modality data are converted into a text format through AI-human cooperative prompt design, PandemicLLMs are fine-tuned with these prompts and targets for 1-week and 3-week forecasts. We emphasize rigorous performance assessment to verify the accuracy and trustworthiness of our predictions.
Figure 2: Summary of the AI-human cooperative prompt design. Spatial data for all 50 U.S. states are converted into descriptions that reflect their rankings; the policy data include stringency levels and week-to-week changes. Epidemiological time series data use both narrative generation and representation learning. Genomic surveillance data combine textual summaries of variant characteristics with recent prevalence. The blue arrow indicates information textualization, while the red arrow indicates sequence representation learning. Each designed prompt has 296 to 322 words.
Figure 3: Visualization of PandemicLLMs' predictions and performance evaluation. (a) 1-week and 3-week predictions by PandemicLLMs versus the ground truth targets. Color indicates Hospitalization Trend Category (HTC): SD: Substantial Decrease, MD: Moderate Decrease, ST: Stable, MI: Moderate Increase, SI: Substantial Increase. (b, c) 1-week and 3-week performance for PandemicLLM-7B. (d, e) 1-week and 3-week performance for PandemicLLM-13B. The color gradients represent the magnitude of the WMSE: a darker shade of red signifies a greater error, and a darker shade of blue denotes a smaller error. Equivalent evaluations with alternative error metrics are included in the Supplementary Information section 9. (f, g) Performance comparison of PandemicLLMs with reference models across time. The red curve in the background represents the weekly reported COVID-19 hospital admissions at the national level. The left y-axis shows the scale of WMSE, and the right y-axis shows the scale of hospital admissions. Each set of bars represents the distribution of WMSE across all states during a specific week, and the bar colors distinguish the error distributions of the different models. (f) 1-week forecasting performance. (g) 3-week forecasting performance.
Figure 4: Trustworthiness of PandemicLLMs. (a, b) Accuracy of 1-week and 3-week predictions across levels of prediction confidence. (c, d) The 1-week and 3-week confusion matrices for PandemicLLM-7B. (e, f) The 1-week and 3-week precision confusion matrices for PandemicLLM-13B. SD: Substantial Decrease, MD: Moderate Decrease, ST: Stable, MI: Moderate Increase, SI: Substantial Increase.
Figure 5: A comparative analysis with and without real-time genomic surveillance information. (a) National estimates of weekly proportions of SARS-CoV-2 variants from September 2022 to January 2023. (b) Comparison of models' performance with and without real-time genomic surveillance information (w/o GSI). (c) Prediction confidence of PandemicLLMs across time. The dashed lines represent the models without real-time genomic surveillance information.
...and 6 more figures
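
The ordinal target described in the Figure 1(b) caption can be illustrated with a minimal sketch. The five category names come from the caption; the percent-change cut-offs below are hypothetical placeholders, not the thresholds used in the paper or by the CDC, and the choice of percent change as the underlying quantity, the function names, and the sample numbers are likewise assumptions for illustration.

```python
from bisect import bisect_right

# Category names from the Figure 1(b) caption, ordered from strongest decrease
# to strongest increase so that the class index carries the ordinal meaning.
CATEGORIES = [
    "Substantial Decrease",
    "Moderate Decrease",
    "Stable",
    "Moderate Increase",
    "Substantial Increase",
]

# ASSUMPTION: hypothetical cut-offs on percent change; the source does not state them.
HYPOTHETICAL_CUTOFFS = [-20.0, -10.0, 10.0, 20.0]


def percent_change(current: float, future: float) -> float:
    """Percent change from the current value to a future value."""
    if current <= 0:
        raise ValueError("current must be positive")
    return 100.0 * (future - current) / current


def trend_label(change_pct: float) -> int:
    """Map a percent change to an ordinal class index in 0..4."""
    return bisect_right(HYPOTHETICAL_CUTOFFS, change_pct)


def make_targets(series: list[float], horizon: int) -> list[int]:
    """Build one ordinal target per week that has a value `horizon` weeks ahead."""
    return [
        trend_label(percent_change(series[t], series[t + horizon]))
        for t in range(len(series) - horizon)
    ]


weekly_admissions = [100.0, 105.0, 130.0, 125.0, 90.0, 88.0]  # made-up numbers
one_week = make_targets(weekly_admissions, horizon=1)
three_week = make_targets(weekly_admissions, horizon=3)
print([CATEGORIES[i] for i in one_week])
```

Keeping the labels as ordered integers rather than unordered strings lets an error metric penalize a prediction of Substantial Increase more heavily than Moderate Decrease when the truth is Substantial Decrease, which is what distinguishes ordinal from plain multi-class classification.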
