TESTAM: A Time-Enhanced Spatio-Temporal Attention Model with Mixture of Experts

Hyunwook Lee, Sungahn Ko

TL;DR

This work addresses accurate traffic forecasting under complex spatio-temporal dependencies and event-driven changes. It introduces TESTAM, a time-enhanced spatio-temporal attention model built as a Mixture-of-Experts with three experts that differ in how they model spatial structure, plus a memory-based gating mechanism that routes each input according to its context. Key components include temporal information embedding via Time2Vec and time-enhanced attention, which help transfer information from historical data to forecasts. Experiments on METR-LA, PEMS-BAY, and EXPY-TKY report state-of-the-art performance, especially for long horizons and non-recurring events, together with computational efficiency and interpretable routing behavior.

Abstract

Accurate traffic forecasting is challenging due to the complex dependencies on road networks, the variety of road types, and abrupt speed changes caused by events. Recent works mainly focus on dynamic spatial modeling with adaptive graph embedding or graph attention, giving less consideration to temporal characteristics and in-situ modeling. In this paper, we propose a novel deep learning model named TESTAM, which individually models recurring and non-recurring traffic patterns with a mixture-of-experts model comprising three experts: temporal modeling, spatio-temporal modeling with a static graph, and dynamic spatio-temporal dependency modeling with a dynamic graph. By introducing different experts and properly routing them, TESTAM can better model various circumstances, including spatially isolated nodes, highly related nodes, and recurring and non-recurring events. For proper routing, we reformulate the gating problem as a classification problem with pseudo labels. Experimental results on three public traffic network datasets, METR-LA, PEMS-BAY, and EXPY-TKY, demonstrate that TESTAM achieves a better indication and modeling of recurring and non-recurring traffic. We published the official code at https://github.com/HyunWookL/TESTAM
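
The idea of treating gating as classification with pseudo labels can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the labeling rule (the expert with the smallest absolute error becomes the pseudo label), top-1 routing, and the helper names `pseudo_label`, `routing_loss`, and `route` are all hypothetical, and the numbers are toy values. The official repository linked above is the reference for the actual method.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of gate scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label(expert_preds, target):
    """Assumed labeling rule: index of the expert with the smallest absolute error."""
    errors = [abs(p - target) for p in expert_preds]
    return errors.index(min(errors))


def routing_loss(gate_logits, label):
    """Cross-entropy between the gate's distribution and the pseudo label."""
    probs = softmax(gate_logits)
    return -math.log(probs[label])


def route(expert_preds, gate_logits):
    """Assumed top-1 routing: output of the expert with the highest gate score."""
    best = max(range(len(gate_logits)), key=gate_logits.__getitem__)
    return expert_preds[best]


# Toy values for one node at one time step (hypothetical speeds).
# Expert order: identity (temporal), adaptive (static graph), attention (dynamic graph).
expert_preds = [61.0, 55.0, 48.0]
gate_logits = [0.2, 1.5, -0.3]
target = 50.0

label = pseudo_label(expert_preds, target)     # expert whose forecast was closest
loss = routing_loss(gate_logits, label)        # penalizes the gate for preferring another expert
prediction = route(expert_preds, gate_logits)  # forecast actually emitted by the mixture
```

In this toy case the gate prefers the second expert while the third was closest to the target, so the classification loss is larger than it would be had the gate chosen the best expert. Minimizing such a loss alongside the forecasting loss pushes the gate toward the expert that would have been most accurate.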

Paper Structure

This paper contains 40 sections, 12 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Overview of TESTAM. Left: The architecture of each expert. Middle: The workflow and routing mechanism of TESTAM. Solid lines indicate forward paths, and dashed lines represent backward paths. Right: The three spatial modeling methods of TESTAM. Black lines indicate spatial connectivity, and red lines represent information flow corresponding to spatial connectivity. The identity, adaptive, and attention experts are responsible for temporal modeling, spatial modeling with a learnable static graph, and spatial modeling with a dynamic graph (i.e., attention), respectively.
  • Figure 2: Visualization of a recurring pattern in a hard-to-predict road, Road 1349, from Dec. 14th to Dec. 17th. Road 1349 is a highway entrance located near Tokyo station. The locations of the roads are indicated in an appendix figure (label fig:appendix_HE).
  • Figure 3: Qualitative forecasting result analysis for spatially isolated roads (I). The locations of the roads are indicated in an appendix figure (label fig:appendix_I).
  • Figure 4: Qualitative forecasting result analysis for Road 1111, a highway ramp located in Shibuya, with unique traffic patterns. The locations of the roads are indicated in an appendix figure (label fig:appendix_HE).
  • Figure 5: Qualitative analysis results for Road 1196 (a metropolitan expressway) on Dec. 14th, where traffic control may occur because of heavy snow (red boxes).
  • ...and 2 more figures