HSTMixer: A Hierarchical MLP-Mixer for Large-Scale Traffic Forecasting
Yongyao Wang, Jingyuan Wang, Xie Yu, Jiahao Ji, Chao Li
TL;DR
HSTMixer introduces a hierarchical, all-MLP framework for large-scale traffic forecasting to overcome the scalability and noise challenges of Transformer- and GNN-based approaches. By stacking spatiotemporal mixing blocks that perform temporal aggregation and spatial cascade mixing, and by incorporating an adaptive region mixer with a region-aware parameter pool, the model builds multi-scale representations with reduced global noise. Extensive experiments on four large-scale real-world datasets demonstrate state-of-the-art accuracy and favorable efficiency, with ablations validating the importance of hierarchical processing and adaptive semantics. The work offers a scalable alternative for city-wide traffic prediction and provides insights into multi-resolution spatiotemporal feature learning and region-level semantics.
Abstract
Traffic forecasting is essential to modern urban management. Recently, large-scale forecasting has attracted growing attention, as it better reflects the complexity of real-world traffic networks. However, existing models often exhibit quadratic computational complexity, making them impractical for large-scale real-world scenarios. In this paper, we propose a novel framework, Hierarchical Spatio-Temporal Mixer (HSTMixer), which leverages an all-MLP architecture for efficient and effective large-scale traffic forecasting. HSTMixer employs a hierarchical spatiotemporal mixing block to extract multi-resolution features through bottom-up aggregation and top-down propagation. Furthermore, an adaptive region mixer generates transformation matrices based on regional semantics, enabling the model to dynamically capture evolving spatiotemporal patterns for different regions. Extensive experiments conducted on four large-scale real-world datasets demonstrate that the proposed method not only achieves state-of-the-art performance but also exhibits competitive computational efficiency.
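To make the idea of bottom-up aggregation and top-down propagation concrete, the following is a minimal sketch and not the paper's implementation. It assumes a fixed, hand-picked node-to-region assignment, mean pooling as the aggregation step, and a single linear map standing in for the region-level MLP; the function names (`mean_pool`, `mix_regions`, `propagate_down`, `hierarchical_mix`) and the toy weights are hypothetical. Temporal aggregation and the adaptive region mixer described above are not reproduced here.

```python
def mean_pool(node_feats, assignment, num_regions):
    """Bottom-up step: average the features of the nodes in each region."""
    dim = len(node_feats[0])
    sums = [[0.0] * dim for _ in range(num_regions)]
    counts = [0] * num_regions
    for feat, region in zip(node_feats, assignment):
        counts[region] += 1
        for d in range(dim):
            sums[region][d] += feat[d]
    # max(c, 1) guards against division by zero for empty regions.
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]


def mix_regions(region_feats, weights):
    """Mix information across regions with an R x R matrix.

    Assumption: one linear layer stands in for the region-level MLP.
    """
    num_regions = len(region_feats)
    dim = len(region_feats[0])
    return [
        [
            sum(weights[i][j] * region_feats[j][d] for j in range(num_regions))
            for d in range(dim)
        ]
        for i in range(num_regions)
    ]


def propagate_down(node_feats, region_feats, assignment):
    """Top-down step: add each region's mixed feature to its member nodes."""
    return [
        [x + region_feats[region][d] for d, x in enumerate(feat)]
        for feat, region in zip(node_feats, assignment)
    ]


def hierarchical_mix(node_feats, assignment, weights):
    """Pool nodes to regions, mix regions, then push the result back down."""
    pooled = mean_pool(node_feats, assignment, len(weights))
    mixed = mix_regions(pooled, weights)
    return propagate_down(node_feats, mixed, assignment)


# Toy example: 4 nodes with 2 features each, grouped into 2 regions.
node_feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
assignment = [0, 0, 1, 1]            # hypothetical node-to-region map
weights = [[1.0, 0.0], [0.5, 0.5]]   # hypothetical region mixing matrix

out = hierarchical_mix(node_feats, assignment, weights)
print(out)  # [[3.0, 5.0], [5.0, 7.0], [9.0, 11.0], [11.0, 13.0]]
```

In this sketch the spatial mixing step touches R x R region pairs rather than N x N node pairs, which illustrates why routing interactions through a coarser level can avoid cost that grows quadratically with the number of nodes.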
