
Make Graph Neural Networks Great Again: A Generic Integration Paradigm of Topology-Free Patterns for Traffic Speed Prediction

Yicheng Zhou, Pengfei Wang, Hao Dong, Denghui Zhang, Dingqi Yang, Yanjie Fu, Pengyang Wang

TL;DR

This work develops a Dual Cross-Scale Transformer (DCST) architecture, comprising a Spatial Transformer and a Temporal Transformer, to preserve cross-scale topology-free patterns and their associated dynamics, respectively, and proposes a distillation-style learning framework to integrate both topology-regularized and topology-free patterns.

Abstract

Urban traffic speed prediction aims to estimate future traffic speed for improving urban transportation services. Enormous efforts have been made to exploit Graph Neural Networks (GNNs) for modeling spatial correlations and temporal dependencies of traffic speed evolving patterns, regularized by graph topology. While achieving promising results, current traffic speed prediction methods still suffer from ignoring topology-free patterns, which cannot be captured by GNNs. To tackle this challenge, we propose a generic model for enabling current GNN-based methods to preserve topology-free patterns. Specifically, we first develop a Dual Cross-Scale Transformer (DCST) architecture, comprising a Spatial Transformer and a Temporal Transformer, to preserve cross-scale topology-free patterns and their associated dynamics, respectively. Then, to further integrate both topology-regularized and topology-free patterns, we propose a distillation-style learning framework, in which an existing GNN-based method serves as the teacher model and the proposed DCST architecture serves as the student model. The teacher model injects the learned topology-regularized patterns into the student model, which integrates them with topology-free patterns. Extensive experimental results demonstrate the effectiveness of our methods.
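The distillation setup described above (and depicted in Figure 3) trains the student against two targets: the frozen teacher's predictions ("soft loss") and the observed speeds ("hard loss"). A minimal sketch of how such a combined objective might be computed is below; all names (`distillation_loss`, `alpha`, `beta`) and the choice of MSE for both terms are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch of a distillation-style objective: the teacher (a frozen,
# pre-trained GNN) supplies topology-regularized predictions, and the student
# (the DCST) is trained against both the teacher's outputs ("soft loss") and
# the observed speeds ("hard loss"). MSE is assumed here for both losses;
# alpha/beta play the role of the trade-off parameters mentioned in Figure 4.

def mse(pred, target):
    """Mean squared error over flat lists of traffic speeds."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def distillation_loss(student, teacher, truth, alpha=0.5, beta=0.5):
    soft = mse(student, teacher)  # match the frozen teacher's patterns
    hard = mse(student, truth)    # match the observed traffic speeds
    return alpha * soft + beta * hard

# Toy example: predicted speeds (km/h) for three road segments.
student = [60.0, 40.0, 30.0]
teacher = [61.0, 41.0, 29.0]   # frozen GNN teacher predictions
truth   = [62.0, 42.0, 28.0]   # ground-truth speeds

loss = distillation_loss(student, teacher, truth)  # 0.5*1.0 + 0.5*4.0 = 2.5
```

In practice the two weights would be tuned jointly, as the paper's Figure 4 suggests by reporting MAPE over different trade-off parameter pairs.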


Paper Structure

This paper contains 19 sections, 9 equations, 5 figures, and 1 table.

Figures (5)

  • Figure 1: An example of topology-free patterns for traffic speed in the road network, where $R1$ is an arterial road near the business area in the new city district; $R2$ is an arterial road near the residential area in the old city district. During the morning and evening rush hours, the overwhelming traffic on $R1$ and $R2$ causes congestion.
  • Figure 2: Framework Overview. (a) Spatial Scale: The features of nodes located in the same grid are aggregated, and different scales are defined by the size of the grid. (b) Temporal Scale: For each node, the features of time points within the same temporal segment are aggregated, and different scales are defined by the length of the temporal segment. (c) The Dual Cross-Scale Transformer is composed of an Embedding Layer, a Temporal Transformer, a Spatial Transformer, and a Prediction Layer.
  • Figure 3: An illustration of the integration process of topology-regularized/-free patterns. The integration follows the teacher-student paradigm, where the GNN-based model is taken as the teacher model (topology-regularized patterns), and the Dual Cross-Scale Transformer is taken as the student model (topology-free patterns). During this process, the GNN-based model is pre-trained and kept fixed. The integration is conducted by jointly optimizing the "Soft Loss" and the "Hard Loss".
  • Figure 4: An illustration of DCST performances w.r.t. different trade-off parameter pairs. We present the results of MAPE on METRLA, PEMSBAY and PEMSD7(M).
  • Figure 5: Ablation Studies of STGCN-KD, DCRNN-KD, GWNet-KD, MTGNN-KD and AGCRN-KD on metrics MAE, RMSE and MAPE on the METRLA dataset.