Efficient Traffic Prediction Through Spatio-Temporal Distillation
Qianru Zhang, Xinyi Gao, Haixin Wang, Siu-Ming Yiu, Hongzhi Yin
TL;DR
This work tackles scalability and over-smoothing in spatio-temporal graph neural networks for traffic forecasting by introducing LightST, a two-level spatio-temporal knowledge distillation framework that transfers knowledge from a GNN teacher to a lightweight MLP student. It combines explicit prediction-level distillation with implicit distribution alignment through contrastive learning across spatial and temporal embeddings, enhanced by adaptive embedding alignment to mitigate over-smoothing. Empirical results on five PeMS datasets show state-of-the-art accuracy with 5x–40x faster inference than the baselines, supporting practical use in real-time traffic forecasting. The approach also provides theoretical insights into reducing over-smoothing, and ablation studies support the necessity of both the spatial and temporal distillation components.
Abstract
Graph neural networks (GNNs) have gained considerable attention in recent years for traffic flow prediction due to their ability to learn spatio-temporal pattern representations through a graph-based message-passing framework. Although GNNs have shown great promise in handling traffic datasets, their deployment in real-life applications has been hindered by scalability constraints arising from high-order message passing. Additionally, the over-smoothing problem of GNNs may lead to indistinguishable region representations as the number of layers increases, resulting in performance degradation. To address these challenges, we propose a new knowledge distillation paradigm termed LightST that transfers spatial and temporal knowledge from a high-capacity teacher to a lightweight student. Specifically, we introduce a spatio-temporal knowledge distillation framework that helps student MLPs capture graph-structured global spatio-temporal patterns while alleviating the over-smoothing effect with adaptive knowledge distillation. Extensive experiments verify that LightST speeds up traffic flow prediction by 5x to 40x compared to state-of-the-art spatio-temporal GNNs while maintaining superior accuracy.
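The two distillation levels described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the loss formulas, so the InfoNCE-style contrastive term, the function names (`info_nce`, `distillation_loss`), the weights `alpha` and `beta`, and the `temperature` value are all assumptions chosen for illustration. For brevity the sketch uses a single set of embeddings rather than separate spatial and temporal ones, and it omits the adaptive alignment component.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length lists of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def cosine(u, v):
    """Cosine similarity between two vectors (small epsilon avoids division by zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(student_emb, teacher_emb, temperature=0.5):
    """Assumed contrastive alignment term (InfoNCE-style).

    The student embedding of node i is pulled toward the teacher embedding
    of the same node (positive) and pushed away from the teacher embeddings
    of the other nodes (negatives).
    """
    n = len(student_emb)
    total = 0.0
    for i in range(n):
        logits = [cosine(student_emb[i], teacher_emb[j]) / temperature
                  for j in range(n)]
        # Numerically stable log-sum-exp over all teacher embeddings.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / n


def distillation_loss(y_true, student_pred, teacher_pred,
                      student_emb, teacher_emb, alpha=0.5, beta=0.1):
    """Hypothetical combined objective for the MLP student.

    task    : student predictions vs. ground-truth traffic values
    pred_kd : explicit prediction-level distillation from the GNN teacher
    emb_kd  : implicit embedding-level alignment via the contrastive term
    alpha and beta are illustrative weights, not values from the paper.
    """
    task = mse(student_pred, y_true)
    pred_kd = mse(student_pred, teacher_pred)
    emb_kd = info_nce(student_emb, teacher_emb)
    return task + alpha * pred_kd + beta * emb_kd
```

At inference time only the student MLP would be evaluated, which is where a speed-up over a message-passing teacher would come from; the teacher appears only in the training losses above.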
