Auditing the Fairness of the US COVID-19 Forecast Hub's Case Prediction Models
Saad Mohammad Abrar, Naman Awasthi, Daniel Smolyak, Vanessa Frias-Martinez
TL;DR
This paper evaluates fairness of county-level forecasts from the US COVID-19 Forecast Hub across race/ethnicity and urbanization, identifying substantial disparities in predictive errors for minority groups and rural counties. It adopts a regression-based framework that computes the forecast error via the pinball loss $PBL$ and fits a Gaussian GLM with a log link, employing 1% trimming of extreme values and GVIF-guided variable selection. It then analyzes interactions with Lookahead, Phase, Model Type, and Mobility to reveal context-dependent fairness, reporting that Hispanic counties incur higher errors relative to White baselines, Asian counties exhibit lower errors, and urbanicity-related disparities persist in less urban areas. Mobility data usage generally reduces disparities, deep-learning and ensemble models show more balanced fairness, and the authors provide an interactive dashboard and fairness nutritional cards to aid decision-makers and advocate reporting fairness metrics alongside accuracy.
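To make the error metric concrete, the following is a minimal sketch of a pinball loss over quantile forecasts and of trimming before regression. It is an illustration under stated assumptions, not the authors' implementation. The function names `pinball_loss` and `trim_upper` are hypothetical, the loss uses the standard form $\tau (y - q)$ when $y \ge q$ and $(1 - \tau)(q - y)$ otherwise, averaged over quantile levels, and the 1% trimming is assumed here to drop the largest 1% of errors; the paper may aggregate or trim differently.

```python
from typing import Dict, List, Sequence


def pinball_loss(y: float, quantile_forecasts: Dict[float, float]) -> float:
    """Mean pinball loss of one observation over a set of quantile forecasts.

    `quantile_forecasts` maps a quantile level tau in (0, 1) to the
    forecast value q for that quantile (hypothetical input layout).
    """
    total = 0.0
    for tau, q in quantile_forecasts.items():
        diff = y - q
        # Under-prediction (y >= q) is weighted by tau,
        # over-prediction (y < q) by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(quantile_forecasts)


def trim_upper(values: Sequence[float], fraction: float = 0.01) -> List[float]:
    """Drop the largest `fraction` of values (assumed trimming rule)."""
    n_drop = int(len(values) * fraction)
    ordered = sorted(values)
    return ordered[: len(ordered) - n_drop]


# Example: observed 100 cases, forecast quantiles at three levels.
loss = pinball_loss(100.0, {0.1: 80.0, 0.5: 95.0, 0.9: 130.0})
```

The trimmed per-county losses would then serve as the response variable in the regression on race/ethnicity and urbanization covariates described above.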
Abstract
The US COVID-19 Forecast Hub, a repository of COVID-19 forecasts from over 50 independent research groups, is used by the Centers for Disease Control and Prevention (CDC) for its official COVID-19 communications. As such, the Forecast Hub is a critical centralized resource for promoting transparent decision making. While evaluations of the Forecast Hub's predictions have focused on accuracy, there is an opportunity to evaluate model performance across social determinants, such as race and urbanization level, that are known to have played a role in the COVID-19 pandemic. In this paper, we carry out a comprehensive fairness analysis of the Forecast Hub model predictions and show statistically significant differences in predictive performance across social determinants, with minority racial and ethnic groups as well as less urbanized areas often associated with higher prediction errors. We hope this work will encourage COVID-19 modelers and the CDC to report fairness metrics together with accuracy, and to reflect on the potential harms of the models to specific social groups and contexts.
