Spatiotemporal Forecasting Meets Efficiency: Causal Graph Process Neural Networks

Aref Einizade, Fragkiskos D. Malliaros, Jhony H. Giraldo

TL;DR

This work tackles scalable spatiotemporal forecasting on graphs by replacing recurrent, 1-hop–restricted GNNs with Causal Graph Processes (CGPs) integrated into a non-linear neural framework, CGProNet. CGProNet leverages discrete polynomial graph filters $P(\mathbf{A},\boldsymbol{\theta}_i)=\sum_{j=0}^{i}{\theta_{ij}\mathbf{A}^j}$ and a non-linear aggregation to produce forecasts $\tilde{\mathbf{x}}_k=\sum_{i=1}^M{\alpha_i \sigma(\sum_{j=0}^{i}{\theta_{ij}\mathbf{A}^j} \mathbf{x}_{k-i})}$, enabling global and local temporal-spatial interactions with far fewer parameters than RNN-based approaches. The paper provides a stability analysis showing the prediction error bound depends on adjacency perturbations $\delta_{\mathbf{A}}$, filter deviations $\delta_{\boldsymbol{\theta}}$, and sparsity-promoting regularization, and demonstrates memory and runtime efficiency across synthetic and real datasets while maintaining competitive forecasting accuracy. Extended results include multi-horizon forecasting variants and an exploration of continuous filters (C2GProNet) in the appendix. Overall, CGProNet offers a practical, resource-efficient alternative for real-time spatiotemporal forecasting on large, sparse graphs.
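The forecast formula above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function names (`poly_filter`, `cgpronet_forecast`, `n_parameters`), the choice of `tanh` for $\sigma$, and the treatment of $\alpha_i$ as learnable scalars are all assumptions made for the example.

```python
import numpy as np

def poly_filter(A, theta, x):
    """Compute P(A, theta) x = sum_j theta[j] * A^j x using repeated mat-vec products."""
    z = np.asarray(x, dtype=float).copy()   # holds A^j x, starting with j = 0
    out = np.zeros_like(z)
    for t in theta:
        out += t * z
        z = A @ z                            # advance to A^(j+1) x
    return out

def cgpronet_forecast(A, thetas, alphas, history, sigma=np.tanh):
    """Sketch of x_tilde_k = sum_i alpha_i * sigma(P(A, theta_i) x_{k-i}).

    history[i-1] holds x_{k-i}; thetas[i-1] holds the i+1 coefficients theta_{i0..ii}.
    sigma=tanh is an assumed non-linearity, not taken from the source.
    """
    x_tilde = np.zeros(A.shape[0])
    for i, (theta, alpha) in enumerate(zip(thetas, alphas), start=1):
        assert len(theta) == i + 1, "filter i is a degree-i polynomial"
        x_tilde += alpha * sigma(poly_filter(A, theta, history[i - 1]))
    return x_tilde

def n_parameters(M):
    """Scalar parameters in this sketch: sum_i (i+1) filter taps plus M weights alpha_i."""
    return sum(i + 1 for i in range(1, M + 1)) + M

# Toy usage on a 3-node directed cycle with M = 2 past steps.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
thetas = [np.array([0.5, 0.2]), np.array([0.1, -0.3, 0.4])]
alphas = [0.7, 0.3]
history = [np.array([1.0, 2.0, 3.0]), np.array([0.0, -1.0, 1.0])]
x_next = cgpronet_forecast(A, thetas, alphas, history)
```

In this sketch the parameter count depends only on the memory length $M$ and not on the number of nodes, and the filter is applied through mat-vec products with $\mathbf{A}$ rather than by forming dense matrix powers, which is one plausible reading of why the approach suits large, sparse graphs.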

Abstract

Graph Neural Networks (GNNs) have advanced spatiotemporal forecasting by leveraging relational inductive biases among sensors (or any other measuring scheme) represented as nodes in a graph. However, current methods often rely on Recurrent Neural Networks (RNNs), leading to increased runtimes and memory use. Moreover, these methods typically operate within 1-hop neighborhoods, which further restricts the receptive field. Causal Graph Processes (CGPs) offer an alternative, using graph filters instead of MLP layers to reduce the number of parameters and the memory consumption. This paper introduces the Causal Graph Process Neural Network (CGProNet), a non-linear model combining CGPs and GNNs for spatiotemporal forecasting. CGProNet employs higher-order graph filters, which lets it use fewer parameters, less memory, and less runtime. We present a comprehensive theoretical and experimental stability analysis, highlighting key aspects of CGProNet. Experiments on synthetic and real data demonstrate CGProNet's superior efficiency, reducing memory and time requirements while maintaining competitive forecasting performance.

Paper Structure (33 sections, 7 theorems, 46 equations, 4 figures, 16 tables)

This paper contains 33 sections, 7 theorems, 46 equations, 4 figures, 16 tables.

Key Result

Proposition 1

Consider the graph filter deviations as $\hat{\theta}_{ij}=\theta_{ij}+z_{ij}$. Then, with the assumptions of the adjacency matrix powers $\{\mathbf{A}^i\}_{i=1}^M$, error matrix $\mathbf{E}=\hat{\mathbf{A}}-\mathbf{A}$, graph filter coefficients $\boldsymbol{\theta}_i$ and their associated deviations ... where $\hat{L}_M(\delta)=\max_{1\le i\le M}{\frac{(L+\delta)^i-L^i}{\delta}}$.
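To get a feel for the factor $\hat{L}_M(\delta)$, the following sketch evaluates it numerically. The statement above is truncated before $L$ and $\delta$ are defined, so here they are simply treated as a non-negative scalar and a positive perturbation size; that reading, and the function name `l_hat`, are assumptions for illustration only.

```python
def l_hat(L, delta, M):
    """Evaluate max over 1 <= i <= M of ((L + delta)**i - L**i) / delta.

    L and delta are treated as plain scalars (an assumption; the source
    text defining them is truncated). delta must be strictly positive.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    return max(((L + delta) ** i - L ** i) / delta for i in range(1, M + 1))

# Worked example: L = 1, delta = 0.1, M = 3.
# i = 1: (1.1 - 1) / 0.1     = 1.0
# i = 2: (1.21 - 1) / 0.1    = 2.1
# i = 3: (1.331 - 1) / 0.1   = 3.31  (the maximum)
value = l_hat(1.0, 0.1, 3)
```

Each term is a difference quotient of $L^i$, so as $\delta \to 0$ it approaches $iL^{i-1}$. Under the scalar reading used here, the factor therefore grows with the memory length $M$ when $L \ge 1$, while for $L < 1$ the maximum can be attained at a smaller $i$.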

Figures (4)

  • Figure 1: Comparison of Mean Squared Error (MSE), GPU memory consumption, and runtime in the forecasting task on the AirQuality dataset.
  • Figure 2: Pipeline of CGProNet. We process the $M$ preceding time steps ($\mathbf{x}_{k-1},\mathbf{x}_{k-2},\dots,\mathbf{x}_{k-M}$) with polynomial graph filters $\{\texttt{GF}_{\mathcal{G},\boldsymbol{\theta}_i}\}_{i=1}^M$ and forecast $\mathbf{x}_{k}$ by applying the aggregation function $\texttt{AGG}\left(\{\texttt{GF}_{\mathcal{G},\boldsymbol{\theta}_i}\mathbf{x}_{k-i}\}_{i=1}^M \right)$. Finally, we train CGProNet by minimizing the difference between the predicted $\tilde{\mathbf{x}}_k$ and the ground-truth time step $\mathbf{x}_k$.
  • Figure 3: Forecasting performance comparison in the case of limited training data.
  • Figure 4: The averaged rMSE vs. sparsity $p$ with different SNRs for the synthetic data.

Theorems & Definitions (14)

  • Definition 1
  • Proposition 1
  • Proposition 2
  • Theorem 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • proof
  • ...and 4 more