Systematic analysis of the effectiveness of adding human mobility data to covid-19 case prediction linear models

Saad Mohammad Abrar; Naman Awasthi; Daniel Smolyak; Vanessa Frias-Martinez

TL;DR

This paper examines whether incorporating human mobility data improves covid-19 case forecasts from linear regression models. It runs a systematic evaluation with elastic net regression, comparing baseline models (past cases only) to mobility-augmented models across five prediction lookaheads and ten mobility datasets, with performance measured as the Spearman correlation between predicted and actual cases. The main finding is that mobility data provides meaningful gains only for about two months near the onset of the testing period, with correlation improvements $ci$ of at most 0.3 and the strongest effects for Apple and Google data. Beyond this window, benefits are small or negative, which raises cost-effectiveness considerations around data access. These results suggest that mobility data has limited long-term utility for covid-19 forecasting, and they point to the need for alternative modeling approaches and careful decisions about data use.

Abstract

Human mobility data has been extensively used in covid-19 case prediction models. Nevertheless, related work has questioned how much mobility data actually helps. We present a systematic analysis across mobility datasets and prediction lookaheads and reveal that adding mobility data to predictive models improves model performance only for about two months at the onset of the testing period, and that performance improvements, measured as the improvement in predicted vs. actual correlation over non-mobility baselines, are at most 0.3.
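To make the performance measure concrete, the following is a minimal sketch of how a correlation improvement could be computed. It assumes the improvement is the Spearman correlation of the mobility model's predictions with actual cases minus that of the baseline model; this page does not spell out the exact formula. The function names and the toy numbers are hypothetical, chosen only for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def correlation_improvement(actual, baseline_pred, mobility_pred):
    """Assumed definition: mobility-model correlation minus baseline correlation."""
    return spearman(mobility_pred, actual) - spearman(baseline_pred, actual)


# Toy data (hypothetical): actual cases and two sets of predictions.
actual = [10, 20, 30, 40, 50]
baseline_pred = [12, 35, 25, 38, 60]   # swaps the order of two observations
mobility_pred = [11, 19, 33, 41, 48]   # preserves the order exactly

ci = correlation_improvement(actual, baseline_pred, mobility_pred)
print(round(ci, 3))
```

Under this assumed definition, a positive value means the mobility-augmented model ranks cases more accurately than the baseline, and a negative value means adding mobility data hurt.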

Paper Structure (8 sections, 1 equation, 3 figures, 1 table)

Introduction
Methods
Datasets
Predictive Models
Train and Test Methodology
Model Performance Analysis
Results
Discussion and Future Work

Figures (3)

Figure 1: Sliding-window approach for train and test data split.
Figure 2: Spearman correlation improvement $ci$ of Apple mobility models compared to baseline models. Each subplot shows five improvement trends, one for each lookahead $l$ on day $d$.
Figure 3: Spearman correlation improvement $ci$ of mobility models (Apple, Google, Descartes, SafeGraph inflows, intraflows, and outflows, and SafeGraph POI: grocery stores, religious sites, restaurants, and schools) compared to baseline models. Each subplot shows five improvement trends, one for each lookahead $l$ on date $d$.
