GSA-Forecaster: Forecasting Graph-Based Time-Dependent Data with Graph Sequence Attention

Yang Li, Di Wang, José M. F. Moura

TL;DR

The paper introduces GSA-Forecaster, a model for forecasting graph-based time-dependent data. It uses Graph Sequence Attention to capture temporal dependencies via temporal neighborhoods, and it encodes spatial structure through sparse graph-aligned layers. When the graph is not given, a graph identification mechanism based on Gaussian Markov random fields infers it. Optional components include a node-level attention extension and a GRU-based recent-trend module for non-stationary scenarios. In experiments on NYC Taxi, PEMS-BAY, ECL, and Traffic, the model is reported to outperform state-of-the-art baselines (e.g., Forecaster, DCRNN, Graph WaveNet, Crossformer), and ablations indicate that temporal neighborhoods, auxiliary information, the GRU, and positional encoding each contribute to accuracy. Target applications include traffic, energy, and mobility forecasting.
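To make the graph identification idea concrete: in a Gaussian Markov random field, a zero entry in the precision (inverse covariance) matrix means the two corresponding nodes are conditionally independent, so no edge joins them. The sketch below is a minimal illustration of that fact, not the paper's estimator, which this summary does not describe. The function name `identify_graph`, the threshold of 0.1, and the 4-node chain graph are all assumptions made for the example.

```python
import numpy as np

def identify_graph(signals, threshold=0.1):
    """Estimate an adjacency matrix from graph signals of shape (T, N).

    Assumption: plain covariance inversion plus thresholding of partial
    correlations, used here as a simple stand-in for a sparse GMRF estimator.
    """
    covariance = np.cov(signals, rowvar=False)        # (N, N) sample covariance
    precision = np.linalg.inv(covariance)             # estimated precision matrix
    scale = np.sqrt(np.diag(precision))
    partial_corr = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial_corr, 0.0)               # ignore self-loops
    return np.abs(partial_corr) > threshold

# Hypothetical ground truth: a chain graph 0 - 1 - 2 - 3.
true_precision = np.array([
    [ 2.0, -1.0,  0.0,  0.0],
    [-1.0,  2.0, -1.0,  0.0],
    [ 0.0, -1.0,  2.0, -1.0],
    [ 0.0,  0.0, -1.0,  2.0],
])

rng = np.random.default_rng(0)
samples = rng.multivariate_normal(
    mean=np.zeros(4), cov=np.linalg.inv(true_precision), size=20000
)
adjacency = identify_graph(samples)
print(adjacency.astype(int))
```

With many samples and few nodes, direct inversion is adequate. When the number of nodes approaches the number of observations, a regularized sparse estimator would be the more usual choice.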

Abstract

Forecasting graph-based, time-dependent data has broad practical applications but presents challenges. Effective models must capture both spatial and temporal dependencies in the data, while also incorporating auxiliary information to enhance prediction accuracy. In this paper, we identify limitations in current state-of-the-art models regarding temporal dependency handling. To overcome this, we introduce GSA-Forecaster, a new deep learning model designed for forecasting in graph-based, time-dependent contexts. GSA-Forecaster utilizes graph sequence attention, a new attention mechanism proposed in this paper, to effectively manage temporal dependencies. GSA-Forecaster integrates the data's graph structure directly into its architecture, addressing spatial dependencies. Additionally, it incorporates auxiliary information to refine its predictions further. We validate its performance using real-world graph-based, time-dependent datasets, where it demonstrates superior effectiveness compared to existing state-of-the-art models.
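As a toy illustration of why attending over temporal neighborhoods can help, consider a single-node series in which the same value occurs once on a rising slope and once on a falling slope. The sketch below is one plausible reading of the idea, not the paper's definition of graph sequence attention. The negative squared distance used as a similarity score, the window length `m`, and both function names are assumptions made for the example.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax; entries of -inf receive weight 0."""
    weights = np.exp(scores - scores.max())
    return weights / weights.sum()

def pointwise_attention(history, query_index):
    """Weight each earlier time step by similarity of single values."""
    keys = history[:query_index]
    scores = -(keys - history[query_index]) ** 2
    return softmax(scores)

def neighborhood_attention(history, query_index, m):
    """Weight each earlier time step by similarity of the length-m windows
    ending at that step and at the query step (assumed neighborhood shape)."""
    query_window = history[query_index - m + 1 : query_index + 1]
    scores = np.full(query_index, -np.inf)  # steps without a full window get weight 0
    for t in range(m - 1, query_index):
        key_window = history[t - m + 1 : t + 1]
        scores[t] = -np.sum((key_window - query_window) ** 2)
    return softmax(scores)

# Single-node series: the value 2.0 appears while rising (index 2)
# and while falling (index 4). The query is the last step, which is rising.
history = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 1.0, 2.0])
query_index = len(history) - 1

point_weights = pointwise_attention(history, query_index)
window_weights = neighborhood_attention(history, query_index, m=2)

print("pointwise:", point_weights[2], point_weights[4])
print("windowed: ", window_weights[2], window_weights[4])
```

Point-wise scoring gives indices 2 and 4 identical weight because their values match the query exactly. Window-based scoring favors index 2, whose preceding step also matches the query's rising trend.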

Paper Structure

This paper contains 38 sections, 1 theorem, 19 equations, 16 figures, 6 tables.

Key Result

Theorem 1

When predicting the $({t+k})^\mathrm{th}$ graph signal, if the similarity scores satisfy the following condition: then the extended attention score ${\alpha}_{t+k,\,t+k}'^{(h)}$ (representing the attention given to the recent trend) satisfies: where $T$ is the number of historical graph signals and $M$ denotes the size of the temporal neighborhood.

Figures (16)

  • Figure 1: Definition of graph-based time-dependent data.
  • Figure 2: Example of graph-based time-dependent data.
  • Figure 3: Illustration of limitations of standard attention. In this example, there is only a single node in the graph structure.
  • Figure 4: Illustration of Graph Sequence Attention and how it addresses the limitation of standard attention.
  • Figure 5: Illustrating spatial dependency. This chart presents the hourly taxi demand near New York Penn Station, Grand Central Terminal, and the Empire State Building, covering the period from 12:00 a.m. on Sunday, March 6, 2016, to 11:59 p.m. on Saturday, March 12, 2016. The x-axis tick labels indicate the start of each date (for instance, '06' denotes the beginning of March 6, 2016, at 12:00 a.m.).
  • ...and 11 more figures

Theorems & Definitions (1)

  • Theorem 1