Early warning of Mpox outbreaks in U.S. jurisdictions using Lasso Vector Autoregression models with cross-jurisdictional lags
Hannah Craddock, Joel O. Wertheim, Eliah Aronoff-Spencer, Mark Beatty, David Valentine, Rishi Graham, Jade C. Wang, Lior Rennert, Seema Shah, Ravi Goyal, Natasha K. Martin
TL;DR
Mpox exhibits episodic, spatially heterogeneous transmission, motivating area-specific forecasts. The authors deploy a sparse VAR framework with cross-jurisdictional lags (VAR-Lasso) to generate rolling two-week-ahead forecasts for eight high-incidence U.S. jurisdictions and identify influential long-lag predictors. External phylogenetic validation in San Diego County aligns the model's cross-jurisdictional signals with observed genetic introductions, and slope-weighted evaluation shows VAR-Lasso consistently outperforms univariate AR-Lasso and naive benchmarks. This approach enables earlier warnings and targeted public health actions by leveraging inter-jurisdictional case dynamics alongside genomic evidence.
Abstract
Mpox is caused by an orthopoxvirus that infects humans and animals and is transmitted primarily through close physical contact. The episodic and spatially heterogeneous dynamics of Mpox transmission underscore the need for timely, area-specific forecasts to support targeted public health responses in the U.S. We develop a Vector Autoregression model with Lasso regularization (VAR-Lasso) to generate rolling two-week-ahead forecasts of weekly Mpox cases for eight high-incidence U.S. jurisdictions using national surveillance data from the Centers for Disease Control and Prevention (CDC). The VAR-Lasso model identifies significant long-lag, cross-jurisdictional predictors. In a case study of San Diego County (SDC), these statistical predictors align with phylogenetic analysis that traces a 2023 cluster in SDC to an outbreak in Illinois six months earlier. Because the need for public health action is often greatest when incidence is increasing, our performance evaluation focuses on positive-slope-weighted error metrics. We compare the forecast performance of the VAR-Lasso model with that of a univariate autoregressive (AR) Lasso model and a naive moving-average estimate, using slope-weighted Root Mean Squared Error (RMSE), slope-weighted Mean Absolute Error (MAE), and slope-weighted bias. Across all observations, the VAR-Lasso model reduces slope-weighted RMSE, MAE, and bias by 12%, 7%, and 66%, respectively, relative to the AR-Lasso model, and by 16%, 13%, and 76% relative to the naive benchmark. Our findings highlight the value of sparse multivariate time-series models that leverage cross-jurisdictional case data for early forecasting of Mpox outbreaks. Such forecasts can help health departments provide timely resources and messaging to mitigate the risks of a future outbreak.
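To make the idea of positive-slope weighting concrete, the following is a minimal Python sketch of slope-weighted RMSE, MAE, and bias. The abstract does not give the exact weighting formula, so the scheme used here is an assumption for illustration only: each week is weighted by the positive part of the week-over-week change in observed cases, and flat or declining weeks receive zero weight. The function names positive_slope_weights and slope_weighted_metrics are hypothetical and are not taken from the paper.

```python
import math


def positive_slope_weights(observed):
    """Assumed weighting: positive part of the week-over-week change.

    The first week has no preceding value, so it gets weight 0.
    Flat or declining weeks also get weight 0.
    """
    weights = [0.0]
    for prev, curr in zip(observed, observed[1:]):
        weights.append(max(curr - prev, 0.0))
    return weights


def slope_weighted_metrics(observed, forecast):
    """Return slope-weighted RMSE, MAE, and bias (error = forecast - observed)."""
    if len(observed) != len(forecast):
        raise ValueError("observed and forecast must have the same length")

    weights = positive_slope_weights(observed)
    total = sum(weights)
    if total == 0:
        raise ValueError("no weeks with increasing incidence; weights sum to zero")

    errors = [f - o for o, f in zip(observed, forecast)]
    rmse = math.sqrt(sum(w * e * e for w, e in zip(weights, errors)) / total)
    mae = sum(w * abs(e) for w, e in zip(weights, errors)) / total
    bias = sum(w * e for w, e in zip(weights, errors)) / total
    return {"rmse": rmse, "mae": mae, "bias": bias}


# Illustrative weekly counts (made up for this sketch, not study data).
observed = [2, 4, 4, 10, 7]
forecast = [2, 3, 5, 7, 9]
metrics = slope_weighted_metrics(observed, forecast)
```

Under this assumed definition, a negative slope-weighted bias indicates that the forecast tends to fall short of observed counts during weeks of rising incidence, which is the failure mode most relevant to early warning.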
