LSTTN: A Long-Short Term Transformer-based Spatio-temporal Neural Network for Traffic Flow Forecasting

Qinyao Luo, Silu He, Xing Han, Yuhan Wang, Haifeng Li

TL;DR

This work proposes LSTTN (Long-Short Term Transformer-based Network), a framework that jointly considers the long- and short-term features in historical traffic flow and uses a short-term trend extractor to learn fine-grained short-term temporal features.

Abstract

Accurate traffic forecasting is a fundamental problem in intelligent transportation systems, and learning long-range traffic representations with key information through spatiotemporal graph neural networks (STGNNs) is a basic assumption of current traffic flow prediction models. However, due to structural limitations, existing STGNNs can only utilize short-range traffic flow data; therefore, the models cannot adequately learn the complex trends and periodic features in traffic flow. In addition, it is challenging to extract the key temporal information from long historical traffic series and obtain a compact representation. To solve these problems, we propose a novel LSTTN (Long-Short Term Transformer-based Network) framework that comprehensively considers the long- and short-term features in historical traffic flow. First, we employ a masked subseries Transformer to infer the content of masked subseries from a small portion of unmasked subseries and their temporal context in a pretraining manner, forcing the model to efficiently learn compressed and contextual subseries temporal representations from long historical series. Then, based on the learned representations, the long-term trend is extracted using stacked 1D dilated convolution layers, and periodic features are extracted by dynamic graph convolution layers. To address the difficulty of time-step-level prediction, LSTTN adopts a short-term trend extractor to learn fine-grained short-term temporal features. Finally, LSTTN fuses the long-term trend, periodic features, and short-term features to obtain the prediction results. Experiments on four real-world datasets show that in 60-minute-ahead long-term forecasting, the LSTTN model achieves a minimum improvement of 5.63% and a maximum improvement of 16.78% over baseline models. The source code is available at https://github.com/GeoX-Lab/LSTTN.
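
The masking step described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the function names, the subseries length of 12, the 288-step history, and the mask ratio of 0.75 are all assumptions chosen for illustration. The abstract only states that a small portion of subseries stays unmasked. The sketch splits a long series into equal-length subseries and selects which ones would be hidden from the encoder during pretraining.

```python
import random


def split_into_subseries(series, sub_len):
    """Split a 1-D series into consecutive, non-overlapping subseries of equal length."""
    if len(series) % sub_len != 0:
        raise ValueError("series length must be a multiple of sub_len")
    return [series[i:i + sub_len] for i in range(0, len(series), sub_len)]


def choose_masked_indices(num_subseries, mask_ratio, seed=0):
    """Pick the subseries to hide; the remaining ones stay visible to the encoder."""
    rng = random.Random(seed)
    num_masked = int(num_subseries * mask_ratio)
    return sorted(rng.sample(range(num_subseries), num_masked))


# Toy long history of 288 time steps (hypothetical length).
x_long = [float(t) for t in range(288)]

# Hypothetical subseries length of 12 steps, giving 24 subseries.
subseries = split_into_subseries(x_long, sub_len=12)

# Assumed mask ratio of 0.75: most subseries are hidden, a small portion stays visible.
masked = choose_masked_indices(len(subseries), mask_ratio=0.75)
visible = [i for i in range(len(subseries)) if i not in masked]

print(len(subseries), len(masked), len(visible))
```

In a pretraining setup of this kind, a Transformer would encode only the `visible` subseries and be trained to reconstruct those listed in `masked`; that model is omitted here.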

Paper Structure (27 sections, 12 equations, 10 figures, 6 tables)

Figures (10)

  • Figure 1: A snapshot of a historical time series of sensor 400236 and sensor 400240 in the PEMS-BAY dataset. The sequences in the upper figure correspond to the green window in the lower figure. The short-term historical trend from 00:00 to 06:00 is not sufficient for the model to accurately predict future traffic flow changes, but considering the long-term trend and periodicity in the long historical series can help the model better determine future trends.
  • Figure 2: An overview of the LSTTN framework. The long historical time series $\mathbf{X}_\mathrm{long}$ is split into subseries of equal length, and the last subseries is taken as the short-term historical time series $\mathbf{X}_\mathrm{short}$. The subseries-level temporal representation $\mathbf{S}$, which is rich in key temporal information, is learned by the subseries temporal representation learner, and the long-term trend and periodic features are then extracted from $\mathbf{S}$. Meanwhile, time-step-level short-term trends are extracted directly from $\mathbf{X}_\mathrm{short}$. Finally, the long-term and short-term features are fused to obtain the prediction results.
  • Figure 3: The structure of the masked subseries Transformer.
  • Figure 4: A demonstration of the 1-dimensional dilated convolution layer. The yellow parallelograms at the bottom represent input data, and the red parallelogram represents the output. Layers in the middle are hidden layers. As the number of layers increases, the receptive field grows exponentially, enabling the efficient capture of long-range features.
  • Figure 5: A demonstration of the spatial-based graph convolution module. In this module, the spatial dependencies between two nodes are assumed to come from three diffusion processes: forward diffusion and backward diffusion, which are defined by the graph structure, and a hidden diffusion that the given graph does not describe. If no graph structure is available, the module captures only the dependencies that arise from hidden diffusion, which are described by the self-adaptive neighbor matrix.
  • ...and 5 more figures
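
The receptive-field growth mentioned in the Figure 4 caption can be checked with a toy example. The sketch below is an illustration only, not the paper's layer: it assumes a kernel size of 2, dilations that double per layer (1, 2, 4), all-ones weights, and no padding, and the helper names are hypothetical. Under those assumptions the receptive field is 1 + (k - 1) * (sum of dilations), which doubles with each added layer.

```python
def receptive_field(kernel_size, dilations):
    """Number of input steps seen by one output of a stack of dilated convolutions."""
    return 1 + (kernel_size - 1) * sum(dilations)


def dilated_conv1d(x, weights, dilation):
    """Unpadded 1-D dilated convolution (cross-correlation) over a list of floats."""
    k = len(weights)
    span = (k - 1) * dilation  # distance between the first and last tap
    return [
        sum(weights[j] * x[i + j * dilation] for j in range(k))
        for i in range(len(x) - span)
    ]


# Eight input steps and three layers with assumed dilations 1, 2, 4.
x = [float(t) for t in range(8)]
dilations = [1, 2, 4]

out = x
for d in dilations:
    # All-ones weights make each output the plain sum of the inputs it can see.
    out = dilated_conv1d(out, [1.0, 1.0], d)

print(receptive_field(2, dilations))  # 8
print(out)                            # [28.0], the sum of all eight inputs
```

The single output equals the sum of all eight inputs, so three layers with two taps each already cover the whole window. A stack of undilated layers with the same kernel size would need seven layers to cover it.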

Theorems & Definitions (2)

  • Definition 1: Traffic network
  • Definition 2: Traffic forecasting problem