Inductive Spatial Temporal Prediction Under Data Drift with Informative Graph Neural Network
Jialun Zheng, Divya Saxena, Jiannong Cao, Hanchen Yang, Penghui Ruan
TL;DR
This work tackles inductive spatial temporal prediction under data drift, where external events and expanding entity sets shift data distributions. It introduces INF-GNN, which distills diversified invariant patterns through a novel Relation Importance (RI) metric to form an informative subgraph and uses an informative temporal memory buffer to emphasize influential timestamps. Both components are integrated via RI loss optimization, which combines a standard loss with Elastic Weight Consolidation and RI-based regularization. On the PEMS3-Stream traffic dataset, INF-GNN achieves state-of-the-art performance for both existing and newly added nodes under substantial drift, demonstrating improved generalization and robustness in dynamic graphs. The approach offers interpretable pattern consolidation and practical applicability to rapidly changing spatial temporal systems.
Abstract
Inductive spatial temporal prediction generalizes from historical data to predict unseen data, which is crucial in highly dynamic scenarios (e.g., traffic systems, stock markets). However, external events (e.g., urban structural growth, market crashes) and newly emerging entities (e.g., locations, stocks) can undermine prediction accuracy by inducing data drift over time. Most existing studies extract invariant patterns to counter data drift but ignore pattern diversity, and therefore generalize poorly to unseen entities. To address this issue, we design an Informative Graph Neural Network (INF-GNN) that distills diversified invariant patterns and improves prediction accuracy under data drift. First, we build an informative subgraph using a purpose-designed metric, Relation Importance (RI), which selects stable entities and distinct spatial relationships. This subgraph also generalizes to new entities' data through neighbor merging. Second, we propose an informative temporal memory buffer that helps the model emphasize valuable timestamps, which are extracted within time intervals using influence functions. This memory buffer allows INF-GNN to discern influential temporal patterns. Finally, we design RI loss optimization for pattern consolidation. Extensive experiments on a real-world dataset under substantial data drift demonstrate that INF-GNN significantly outperforms existing alternatives.
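As an illustration of the pattern-consolidation idea, the TL;DR notes that RI loss optimization combines a standard loss with Elastic Weight Consolidation (EWC) and RI-based regularization. The sketch below shows only the generic EWC part, in its usual quadratic form: a penalty that discourages parameters from moving away from previously learned values in proportion to their estimated importance. It is not the authors' implementation. The function names, the choice of mean squared error as the standard loss, the weight lam, and the per-parameter importance values are all assumptions made for illustration, and the RI-based term is omitted because its form is not given here.

```python
def ewc_penalty(theta, theta_star, fisher, lam):
    """Generic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2.

    theta      : current parameters (flat list of floats)
    theta_star : parameters learned before the drift (assumed stored)
    fisher     : per-parameter importance weights (assumed precomputed)
    lam        : penalty strength (hypothetical hyperparameter)
    """
    weighted_sq = sum(f * (t - s) ** 2 for t, s, f in zip(theta, theta_star, fisher))
    return 0.5 * lam * weighted_sq


def total_loss(pred, target, theta, theta_star, fisher, lam=1.0):
    """Standard loss (MSE here, as an assumption) plus the EWC penalty."""
    mse = sum((p - y) ** 2 for p, y in zip(pred, target)) / len(pred)
    return mse + ewc_penalty(theta, theta_star, fisher, lam)


# Toy usage: the first parameter is marked as important (weight 4.0),
# so moving it away from its old value is penalized more heavily.
loss = total_loss(
    pred=[1.0, 2.0], target=[1.0, 4.0],
    theta=[2.0, 0.0], theta_star=[1.0, 0.0],
    fisher=[4.0, 1.0], lam=1.0,
)
```

In this toy call the prediction error and the penalty contribute equally; raising lam shifts the balance toward preserving old parameters, while lowering it favors fitting the new data.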
