GSA-Forecaster: Forecasting Graph-Based Time-Dependent Data with Graph Sequence Attention
Yang Li, Di Wang, José M. F. Moura
TL;DR
The paper addresses forecasting graph-based time-dependent data by introducing GSA-Forecaster, which employs Graph Sequence Attention to capture temporal dependencies via temporal neighborhoods, while integrating spatial structure through sparse graph-aligned layers. It also provides a graph identification mechanism using Gaussian Markov random fields when the graph is not given, and includes an optional node-level attention extension and a GRU-based recent-trend module for non-stationary scenarios. Extensive experiments on NYC Taxi, PEMS-BAY, ECL, and Traffic demonstrate consistent outperformance over state-of-the-art models (e.g., Forecaster, DCRNN, Graph WaveNet, Crossformer), with ablations confirming the importance of temporal neighborhoods, auxiliary information, GRU, and positional encoding. The approach achieves robust, scalable forecasting across multiple domains, offering practical impact for real-world traffic, energy, and mobility forecasting tasks.
Abstract
Forecasting graph-based, time-dependent data has broad practical applications but presents challenges. Effective models must capture both spatial and temporal dependencies in the data, while also incorporating auxiliary information to enhance prediction accuracy. In this paper, we identify limitations in current state-of-the-art models regarding temporal dependency handling. To overcome this, we introduce GSA-Forecaster, a new deep learning model designed for forecasting in graph-based, time-dependent contexts. GSA-Forecaster utilizes graph sequence attention, a new attention mechanism proposed in this paper, to effectively manage temporal dependencies. GSA-Forecaster integrates the data's graph structure directly into its architecture, addressing spatial dependencies. Additionally, it incorporates auxiliary information to refine its predictions further. We validate its performance using real-world graph-based, time-dependent datasets, where it demonstrates superior effectiveness compared to existing state-of-the-art models.
