Investigating Forecasting Models for Pandemic Infections Using Heterogeneous Data Sources: A 2-year Study with COVID-19

Zacharias Komodromos, Kleanthis Malialis, Panayiotis Kolios

TL;DR

The paper addresses near-term COVID-19 infection forecasting in a data-rich, multi-source setting. It compares two forecasting approaches, XGBoost and ARIMAX, trained on a two-year Cyprus case study that integrates epidemiological, vaccination, policy, and weather data. Key findings show that infection-related features are central to predictive performance, that external signals such as policy and weather provide additional gains, and that vaccination signals have limited near-term predictive power. Horizon effects differ by regime, with XGBoost performing better during waves and ARIMAX during non-wave periods. The work contributes to pandemic preparedness by demonstrating how heterogeneous data fusion and careful feature selection can improve forecast accuracy in a real-world setting, and it offers generalisable insights for similar regions.

Abstract

Emerging in December 2019, the COVID-19 pandemic caused widespread health, economic, and social disruptions. Rapid global transmission overwhelmed healthcare systems, resulting in high infection rates, hospitalisations, and fatalities. To minimise the spread, governments implemented several non-pharmaceutical interventions like lockdowns and travel restrictions. While effective in controlling transmission, these measures also posed significant economic and societal challenges. Although the WHO declared COVID-19 no longer a global health emergency in May 2023, its impact persists, shaping public health strategies. The vast amount of data collected during the pandemic offers valuable insights into disease dynamics, transmission, and intervention effectiveness. Leveraging these insights can improve forecasting models, enhancing preparedness and response to future outbreaks while mitigating their social and economic impact. This paper presents a large-scale case study on COVID-19 forecasting in Cyprus, utilising a two-year dataset that integrates epidemiological data, vaccination records, policy measures, and weather conditions. We analyse infection trends, assess forecasting performance, and examine the influence of external factors on disease dynamics. The insights gained contribute to improved pandemic preparedness and response strategies.

Paper Structure

This paper contains 18 sections, 5 equations, 7 figures, 10 tables.

Figures (7)

  • Figure 1: Daily COVID-19 cases in Cyprus (01/10/2020 to 31/12/2022), highlighting five distinct waves.
  • Figure 2: Time series of (1) weekly rolling average of deaths, (2) total of hospitalised people due to COVID-19 per day, and (3) total people in ICU due to COVID-19 per day.
  • Figure 3: Time series of weekly vaccinations by dose number.
  • Figure 4: Time series of daily policy indices. In order: General Stringency Index, Workplace Closing, and Facial Coverings.
  • Figure 5: Weekly rolling averages of average daily temperature and average daily wind-speed.
  • ...and 2 more figures