HSTMixer: A Hierarchical MLP-Mixer for Large-Scale Traffic Forecasting

Yongyao Wang, Jingyuan Wang, Xie Yu, Jiahao Ji, Chao Li

TL;DR

HSTMixer introduces a hierarchical, all-MLP framework for large-scale traffic forecasting to overcome the scalability and noise challenges of transformer and GNN-based approaches. By stacking spatiotemporal mixing blocks that perform temporal aggregation and spatial cascade mixing, and by incorporating an adaptive region mixer with a region-aware parameter pool, the model builds multi-scale representations with reduced global noise. Extensive experiments on four large-scale real-world datasets demonstrate state-of-the-art accuracy and favorable efficiency, with ablations validating the importance of hierarchical processing and adaptive semantics. The work offers a scalable alternative for city-wide traffic prediction and provides insights into multi-resolution spatiotemporal feature learning and region-level semantics.

Abstract

Traffic forecasting is significant to modern urban management. Recently, there has been growing attention on large-scale forecasting, as it better reflects the complexity of real-world traffic networks. However, existing models often exhibit computational complexity that is quadratic, making them impractical for large-scale real-world scenarios. In this paper, we propose a novel framework, Hierarchical Spatio-Temporal Mixer (HSTMixer), which leverages an all-MLP architecture for efficient and effective large-scale traffic forecasting. HSTMixer employs a hierarchical spatiotemporal mixing block to extract multi-resolution features through bottom-up aggregation and top-down propagation. Furthermore, an adaptive region mixer generates transformation matrices based on regional semantics, enabling our model to dynamically capture evolving spatiotemporal patterns for different regions. Extensive experiments conducted on four large-scale real-world datasets demonstrate that the proposed method not only achieves state-of-the-art performance but also exhibits competitive computational efficiency.
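To make the idea of bottom-up aggregation followed by top-down propagation concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a single-channel input of shape nodes × time steps, a two-layer MLP with ReLU for every mixing step, a fixed halving of both axes, and a residual connection; the names `mlp_axis0`, `build_params`, and `hierarchical_mix` are hypothetical and do not come from the paper.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(row) for row in zip(*m)]

def relu(m):
    return [[max(0.0, v) for v in row] for row in m]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def mlp_axis0(x, w1, w2):
    """Two-layer MLP that mixes along the first axis of x.
    x: n_in x f, w1: hidden x n_in, w2: n_out x hidden, result: n_out x f."""
    return matmul(w2, relu(matmul(w1, x)))

def build_params(num_nodes, num_steps, hidden, rng):
    """Random weights for each mixing step (assumed layout, halving each axis)."""
    s2, t2 = num_nodes // 2, num_steps // 2
    def pair(n_in, n_out):
        return rand_matrix(hidden, n_in, rng), rand_matrix(n_out, hidden, rng)
    return {
        "time_down": pair(num_steps, t2),    # aggregate time steps
        "space_down": pair(num_nodes, s2),   # aggregate nodes into coarse units
        "space_up": pair(s2, num_nodes),     # propagate back to nodes
        "time_up": pair(t2, num_steps),      # propagate back to time steps
    }

def hierarchical_mix(x, params):
    """x: nodes x steps. Returns a tensor of the same shape."""
    # Bottom-up: mix along time (on the transposed view), then along space.
    coarse_t = transpose(mlp_axis0(transpose(x), *params["time_down"]))
    coarse = mlp_axis0(coarse_t, *params["space_down"])
    # Top-down: expand the coarse representation back to full resolution.
    up_s = mlp_axis0(coarse, *params["space_up"])
    up = transpose(mlp_axis0(transpose(up_s), *params["time_up"]))
    # Residual connection (an assumption of this sketch).
    return add(x, up)

rng = random.Random(0)
S, T, HIDDEN = 8, 4, 6
x = [[rng.random() for _ in range(T)] for _ in range(S)]
params = build_params(S, T, HIDDEN, rng)
y = hierarchical_mix(x, params)
print(len(y), len(y[0]))  # same shape as the input: 8 4
```

Each weight matrix here has a size set by the hidden width and one axis length, so the parameter count grows linearly with the number of nodes rather than with the number of node pairs. This is one plausible reading of why an all-MLP design can scale better than pairwise attention; the paper's actual block structure and cost analysis may differ.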

Paper Structure

This paper contains 34 sections, 14 equations, 9 figures, 5 tables.

Figures (9)

  • Figure 1: Attention performance on graphs of different scales. In small-scale graphs with fewer nodes, attention can effectively capture the correlations between nodes. However, as the number of nodes increases in large-scale graphs, significant noise dilutes its effectiveness.
  • Figure 2: Spatiotemporal Hierarchy. At the macro-level, temporal data is driven by periodicity and trends (left), while spatial data is influenced by regional correlations, such as peak hours between residential and work areas or increased weekend traffic between residential and park areas (right). At the micro-level, neighboring samples exhibit similar values.
  • Figure 3: The overall framework of HSTMixer.
  • Figure 4: Structure of the Adaptive Region Mixer, where $\left( S_k \right) \times T_l \times d$ represents mixing along the spatial dimension $\left(S_k\right)$.
  • Figure 5: Results of the ablation study, verifying the effectiveness of each component.
  • ...and 4 more figures