Systematic analysis of the effectiveness of adding human mobility data to covid-19 case prediction linear models

Saad Mohammad Abrar, Naman Awasthi, Daniel Smolyak, Vanessa Frias-Martinez

TL;DR

This paper asks whether incorporating human mobility data improves covid-19 case forecasts from regression models. It runs a systematic evaluation across datasets and lookaheads using elastic net regression, comparing baseline models (past cases only) with mobility-augmented models over five prediction horizons and ten mobility datasets. Performance is measured by the Spearman correlation between predicted and actual cases. The main finding is that mobility data provides meaningful gains only for about two months near the onset of the testing period, with correlation improvements $ci$ of at most 0.3 and the strongest effects for Apple and Google data. Beyond this window, benefits are small or negative, which raises the question of whether mobility data is worth the cost of accessing it. These results suggest limited long-term utility of mobility data for covid-19 forecasting and point to the need for alternative modeling approaches and careful decisions about data use.

Abstract

Human mobility data has been extensively used in covid-19 case prediction models. Nevertheless, related work has questioned whether mobility data really helps that much. We present a systematic analysis across mobility datasets and prediction lookaheads and reveal that adding mobility data to predictive models improves model performance only for about two months at the onset of the testing period, and that performance improvements -- measured as predicted vs. actual correlation improvement over non-mobility baselines -- are at most 0.3.
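
To make the evaluation metric concrete, the following is a minimal sketch of how a correlation improvement $ci$ could be computed from one test window. It assumes that $ci$ is the Spearman correlation of the mobility model's predictions with actual cases minus the same correlation for the baseline model; the summary above does not give the exact formula, so this difference form is an assumption. The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def correlation_improvement(actual, baseline_pred, mobility_pred):
    """Assumed definition: ci = rho(mobility, actual) - rho(baseline, actual)."""
    return spearman(mobility_pred, actual) - spearman(baseline_pred, actual)


# Hypothetical case counts for five regions on one test date.
actual = [10, 20, 30, 40, 50]
baseline_pred = [12, 18, 45, 35, 50]   # swaps the ranks of two regions
mobility_pred = [11, 19, 31, 39, 52]   # preserves the actual ordering
ci = correlation_improvement(actual, baseline_pred, mobility_pred)
```

Under this definition a positive $ci$ means the mobility-augmented model ranks regions more like the actual case counts than the baseline does, and a negative $ci$ means adding mobility data hurt.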

Paper Structure (8 sections, 1 equation, 3 figures, 1 table)

Figures (3)

  • Figure 1: Sliding-window approach for train and test data split.
  • Figure 2: Spearman correlation improvement $ci$ of Apple mobility models compared to baseline models. Each subplot shows five improvement trends, one for each lookahead $l$ on day $d$.
  • Figure 3: Spearman correlation improvement $ci$ of mobility models (Apple, Google, Descartes, SafeGraph inflows, intraflows, and outflows, and SafeGraph POI (grocery stores, religious sites, restaurants, schools)) compared to baseline models. Each subplot shows five improvement trends, one for each lookahead $l$ on date $d$.
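
The sliding-window split named in the Figure 1 caption can be sketched generically as follows. The window lengths, the step size, and the function name are hypothetical, since this summary does not state the values used in the paper; the sketch only illustrates the general idea of moving a fixed-length training window and the test window that follows it forward through time.

```python
def sliding_window_splits(n_days, train_len, test_len, step):
    """Yield (train_days, test_days) index ranges over days 0..n_days-1.

    Each training window is immediately followed by its test window, and
    both move forward by `step` days per split. All parameters are
    illustrative assumptions, not values from the paper.
    """
    start = 0
    while start + train_len + test_len <= n_days:
        train_days = range(start, start + train_len)
        test_days = range(start + train_len, start + train_len + test_len)
        yield train_days, test_days
        start += step


# Hypothetical example: 10 days, 4-day training window, 2-day test window.
splits = list(sliding_window_splits(n_days=10, train_len=4, test_len=2, step=2))
```

Because every test window lies strictly after its training window, a model fitted on one window is only ever scored on later dates.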