MAT-MPNN: A Mobility-Aware Transformer-MPNN Model for Dynamic Spatiotemporal Prediction of HIV Diagnoses in California, Florida, and New England

Zhaoxuan Wang, Weichen Kang, Yutian Han, Lingyuan Zhao, Bo Li

TL;DR

This study addresses the challenge of predicting county-level HIV diagnoses by introducing MAT-MPNN, a Mobility-Aware Transformer-MPNN framework that couples a Transformer encoder for long-range temporal dynamics with a Mobility Graph Generator that constructs time-varying, mobility-informed adjacencies. This adjacency is blended with static geographic connections via a learnable parameter, enabling dynamic spatial coupling that captures noncontiguous interactions. Compared with baselines such as Transformer-MPNN and SVAR, MAT-MPNN substantially improves predictive accuracy and calibration (e.g., MSPE reductions of up to 39.1% and CRPS improvements) across California, Florida, and New England, and yields well-calibrated predictive intervals. The approach offers a flexible, mobility-informed paradigm for spatiotemporal epidemiology with potential applicability to other infectious diseases and health indicators.

Abstract

Human Immunodeficiency Virus (HIV) has posed a major global health challenge for decades, and forecasting HIV diagnoses continues to be a critical area of research. However, capturing the complex spatial and temporal dependencies of HIV transmission remains challenging. Conventional Message Passing Neural Network (MPNN) models rely on a fixed binary adjacency matrix that encodes only geographic adjacency and therefore cannot represent interactions between non-contiguous counties. We propose the Mobility-Aware Transformer-Message Passing Neural Network (MAT-MPNN), a deep learning framework for predicting county-level HIV diagnosis rates across California, Florida, and the New England region. The model combines temporal features extracted by a Transformer encoder with spatial relationships captured through a Mobility Graph Generator (MGG). The MGG improves on conventional adjacency matrices by combining geographic and demographic information. Compared with the best-performing hybrid baseline, the Transformer-MPNN model, MAT-MPNN reduced the Mean Squared Prediction Error (MSPE) by 27.9% in Florida, 39.1% in California, and 12.5% in New England, and improved the Predictive Model Choice Criterion (PMCC) by 7.7%, 3.5%, and 3.9%, respectively. MAT-MPNN also outperformed the Spatially Varying Auto-Regressive (SVAR) model in Florida and New England and performed comparably in California. These results demonstrate that mobility-aware dynamic spatial structures substantially enhance predictive accuracy and calibration in spatiotemporal epidemiological prediction.
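To make the reported percentages concrete, the following minimal sketch shows how an MSPE and a relative reduction against a baseline could be computed. The function names and the toy numbers are illustrative assumptions, not values or code from the paper.

```python
def mspe(y_true, y_pred):
    """Mean squared prediction error over paired observations (plain lists)."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline_error, candidate_error):
    """Percent by which the candidate error is lower than the baseline error."""
    return 100.0 * (baseline_error - candidate_error) / baseline_error


# Toy diagnosis rates for four counties (hypothetical, not from the paper).
observed = [10.0, 12.0, 8.0, 15.0]
baseline_pred = [12.0, 10.0, 9.0, 18.0]   # stand-in for a baseline model
candidate_pred = [11.0, 11.0, 8.0, 17.0]  # stand-in for an improved model

base = mspe(observed, baseline_pred)    # (4 + 4 + 1 + 9) / 4 = 4.5
cand = mspe(observed, candidate_pred)   # (1 + 1 + 0 + 4) / 4 = 1.5
print(f"{relative_reduction(base, cand):.1f}% lower MSPE")  # 66.7% lower MSPE
```

Read this way, a 39.1% reduction means the candidate's MSPE is 60.9% of the baseline's on the same held-out data.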

Paper Structure

This paper contains 19 sections, 19 equations, 9 figures, 1 table.

Figures (9)

  • Figure 2.3.1: County-level new HIV diagnosis rates in 2022 (cases per 100,000) across the USA
  • Figure 3.1.1: Overall architecture of the proposed MAT-MPNN framework. The model integrates temporal learning via a Transformer encoder, adaptive graph construction via the Mobility Graph Generator (MGG), and spatial message passing via the MPNN to jointly capture temporal evolution and spatial diffusion of HIV diagnoses.
  • Figure 3.3.1: Temporal–spatial encoding pipeline using California counties as an example. Each county provides temporal and spatial covariates, which are linearly projected and processed through a Transformer encoder to produce temporally contextualized embeddings $H$. Both $H$ and $S$ are duplicated and combined element-wise to form the unified representation $X = \tilde{H} + \tilde{S}$.
  • Figure 3.4.1: The MGG constructs time-varying adjacency matrices by encoding county embeddings, scoring and selecting the most important edges, normalizing their connection strengths, and integrating them with static geographic borders to capture both dynamic and static spatial dependencies.
  • Figure 4.1.1: Training and validation loss curves of the MAT-MPNN for California (left), Florida (middle), and New England (right) across training epochs. The curves show smooth convergence and comparable train-validation behavior, indicating stable model fitting without severe overfitting.
  • ...and 4 more figures
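
The steps named in the Figure 3.4.1 caption (score edges, keep the most important ones, normalize, integrate with static borders) can be sketched for a single time step as follows. This is an illustration under stated assumptions, not the authors' implementation: the scaled dot-product scorer, top-k selection, softmax normalization, and sigmoid-gated convex blend are all assumed choices, and `mobility_adjacency`, `row_normalize`, `blend`, and `theta` are hypothetical names.

```python
import numpy as np


def mobility_adjacency(emb, k):
    """Score county pairs, keep the top-k per row, softmax-normalize kept scores."""
    n, d = emb.shape
    if not 1 <= k <= n - 1:
        raise ValueError("k must be between 1 and n - 1")
    scores = emb @ emb.T / np.sqrt(d)      # assumed scorer: scaled dot product
    np.fill_diagonal(scores, -np.inf)      # exclude self-edges from selection
    top = np.argpartition(-scores, k - 1, axis=1)[:, :k]  # k best neighbours per row
    rows = np.arange(n)[:, None]
    kept = scores[rows, top]
    kept = np.exp(kept - kept.max(axis=1, keepdims=True))  # numerically stable softmax
    adj = np.zeros((n, n))
    adj[rows, top] = kept / kept.sum(axis=1, keepdims=True)
    return adj


def row_normalize(adj):
    """Scale each row of a binary border matrix so it sums to one."""
    return adj / adj.sum(axis=1, keepdims=True)


def blend(adj_mobility, adj_geo, theta):
    """Convex blend; theta stands in for the learnable parameter (fixed here)."""
    alpha = 1.0 / (1.0 + np.exp(-theta))   # sigmoid keeps the weight in (0, 1)
    return alpha * adj_mobility + (1.0 - alpha) * adj_geo


# Toy setup: four counties in a chain 0-1-2-3 with random embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
geo = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

A_mob = mobility_adjacency(emb, k=2)
A = blend(A_mob, row_normalize(geo), theta=0.0)  # theta = 0 gives equal weights
print(A.round(3))
```

Because the top-k step ignores borders, the blended matrix can assign nonzero weight to county pairs that do not touch, which is the noncontiguous coupling the captions describe. In a time-varying setting the same construction would be repeated with embeddings for each time step.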