Navigating Spatio-Temporal Heterogeneity: A Graph Transformer Approach for Traffic Forecasting
Jianxiang Zhou, Erdong Liu, Wei Chen, Siru Zhong, Yuxuan Liang
TL;DR
The paper addresses traffic forecasting by modeling both spatio-temporal dependencies and heterogeneity across space and time. It introduces STGormer, a Transformer-based framework that combines two simple spatial encodings (degree centrality and a shortest-path-distance (SPD) attention bias) with Time2Vec temporal embeddings and feedforward networks enhanced with a mixture of experts (MoE), so that different traffic patterns are processed by specialized experts. Empirical results on three real-world datasets show STGormer achieving state-of-the-art accuracy, with ablations confirming the importance of the temporal and spatial encodings and of MoE routing. The approach offers a practical, scalable improvement for smart-city traffic forecasting and shows how heterogeneity can be captured through gated expert mixtures and graph-aware attention.
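To make the two spatial encodings concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes an unweighted, undirected road graph given as an adjacency matrix, computes node degrees and all-pairs shortest-path distances (SPD, counted in hops), and adds a per-distance scalar bias to the attention scores. The function names, the toy graph, and the fixed `bias_table` (which would be learned in a real model) are all illustrative assumptions.

```python
from collections import deque

import numpy as np


def degree_centrality(adj):
    """Number of neighbours of each node in an unweighted graph."""
    return (adj > 0).sum(axis=1)


def shortest_path_distances(adj, unreachable=-1):
    """All-pairs hop counts via BFS; unreachable pairs keep `unreachable`."""
    n = adj.shape[0]
    dist = np.full((n, n), unreachable, dtype=int)
    for src in range(n):
        dist[src, src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in np.nonzero(adj[u])[0]:
                if dist[src, v] == unreachable:
                    dist[src, v] = dist[src, u] + 1
                    queue.append(v)
    return dist


def attention_with_spd_bias(q, k, spd, bias_table):
    """Softmax attention weights with a scalar bias looked up by hop distance."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    last = len(bias_table) - 1
    # Distances beyond the table are clipped; unreachable pairs use the last slot.
    idx = np.where(spd < 0, last, np.minimum(spd, last - 1))
    scores = scores + bias_table[idx]
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)


# Toy road graph (hypothetical): a path 0 - 1 - 2 - 3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
deg = degree_centrality(adj)
spd = shortest_path_distances(adj)

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(4, 8))
# One bias per hop distance 0..3, plus a final slot for unreachable pairs.
bias_table = np.array([0.5, 0.2, 0.0, -0.2, -1.0])
attn = attention_with_spd_bias(q, k, spd, bias_table)
```

In Graphormer-style models the degree usually indexes a learnable embedding that is added to the node features; whether STGormer uses exactly that form is an assumption here.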
Abstract
Traffic forecasting has emerged as a crucial research area in the development of smart cities. Although various neural networks with intricate architectures have been developed to address this problem, they still face two key challenges: i) recent advances in network designs for modeling spatio-temporal correlations are yielding diminishing returns in performance; ii) most models do not account for the spatio-temporal heterogeneity inherent in traffic data, i.e., traffic distribution varies significantly across different regions and traffic flow patterns fluctuate across time slots. To tackle these challenges, we introduce the Spatio-Temporal Graph Transformer (STGormer), which integrates the attribute and structure information inherent in traffic data to learn spatio-temporal correlations and uses a mixture-of-experts module to capture heterogeneity along the spatial and temporal axes. Specifically, we design two straightforward yet effective spatial encoding methods based on the graph structure and integrate time position encoding into the vanilla transformer to capture spatio-temporal traffic patterns. Additionally, a mixture-of-experts enhanced feedforward neural network (FNN) module adaptively assigns suitable expert layers to distinct patterns via a spatio-temporal gating network, further improving overall prediction accuracy. Experiments on real-world traffic datasets demonstrate that STGormer achieves state-of-the-art performance.
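The mixture-of-experts feedforward idea can be illustrated with a short NumPy sketch. It is a simplified stand-in rather than the paper's module: the gate here is a single linear layer over token features with top-k routing, whereas the abstract describes a spatio-temporal gating network whose exact inputs are not specified in this section. All names, shapes, and the choice of `top_k=2` are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def moe_ffn(x, gate_w, experts, top_k=2):
    """Route each token to its top-k experts and mix their outputs.

    x: (tokens, d); gate_w: (d, num_experts);
    experts: list of (w1, b1, w2, b2) two-layer ReLU feedforward weights.
    """
    logits = x @ gate_w                                    # (tokens, E)
    top_idx = np.argsort(logits, axis=-1)[:, -top_k:]      # (tokens, k)
    top_logits = np.take_along_axis(logits, top_idx, axis=-1)
    top_weights = softmax(top_logits, axis=-1)             # renormalised over the k picks
    out = np.zeros_like(x)
    for e, (w1, b1, w2, b2) in enumerate(experts):
        chosen = top_idx == e                              # (tokens, k)
        mask = chosen.any(axis=-1)                         # tokens routed to expert e
        if not mask.any():
            continue
        weight = np.where(chosen, top_weights, 0.0).sum(axis=-1)
        hidden = np.maximum(x[mask] @ w1 + b1, 0.0)        # ReLU
        out[mask] += weight[mask, None] * (hidden @ w2 + b2)
    return out, top_idx


rng = np.random.default_rng(0)
d, hidden_dim, num_experts = 8, 16, 4
x = rng.normal(size=(6, d))                # 6 tokens, e.g. (node, time-step) pairs
gate_w = rng.normal(size=(d, num_experts))
experts = [
    (rng.normal(size=(d, hidden_dim)), np.zeros(hidden_dim),
     rng.normal(size=(hidden_dim, d)), np.zeros(d))
    for _ in range(num_experts)
]
out, routed = moe_ffn(x, gate_w, experts, top_k=2)
```

Because the top-k gate weights are renormalised to sum to one, the layer reduces to an ordinary feedforward network when all experts share the same weights; specialization only appears once the experts diverge during training.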
