Multi-resolution Time-Series Transformer for Long-term Forecasting

Yitian Zhang, Liheng Ma, Soumyasundar Pal, Yingxue Zhang, Mark Coates

TL;DR

Multi-resolution Time-Series Transformer (MTST) is a novel framework with a multi-branch architecture that models diverse temporal patterns at different resolutions simultaneously. It employs relative positional encoding, which is better suited for extracting periodic components at different scales.

Abstract

The performance of transformers for time-series forecasting has improved significantly. Recent architectures learn complex temporal patterns by segmenting a time-series into patches and using the patches as tokens. The patch size controls the ability of transformers to learn the temporal patterns at different frequencies: shorter patches are effective for learning localized, high-frequency patterns, whereas mining long-term seasonalities and trends requires longer patches. Inspired by this observation, we propose a novel framework, Multi-resolution Time-Series Transformer (MTST), which consists of a multi-branch architecture for simultaneous modeling of diverse temporal patterns at different resolutions. In contrast to many existing time-series transformers, we employ relative positional encoding, which is better suited for extracting periodic components at different scales. Extensive experiments on several real-world datasets demonstrate the effectiveness of MTST in comparison to state-of-the-art forecasting techniques.
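The link between patch size and resolution can be illustrated with a minimal sketch. This is not the authors' implementation: the helper `make_patches`, the synthetic 96-step window, the patch sizes 8, 24 and 48, and the non-overlapping stride are all assumptions chosen for illustration. It only shows how one series yields a different token sequence for each branch.

```python
import numpy as np

def make_patches(series, patch_len, stride):
    """Split a 1-D series into patches of length patch_len (hypothetical helper)."""
    series = np.asarray(series, dtype=float)
    if patch_len > len(series):
        raise ValueError("patch_len must not exceed the series length")
    # Number of full patches that fit when stepping by `stride`.
    n_patches = (len(series) - patch_len) // stride + 1
    starts = np.arange(n_patches) * stride
    return np.stack([series[s:s + patch_len] for s in starts])

# Assumed look-back window: 96 steps with a 24-step cycle.
t = np.arange(96)
window = np.sin(2 * np.pi * t / 24)

# Assumed patch sizes, one per branch; stride equals patch size (no overlap).
branch_patch_sizes = [8, 24, 48]
tokens_per_branch = {
    p: make_patches(window, p, stride=p) for p in branch_patch_sizes
}

for p, tokens in tokens_per_branch.items():
    # Each row is one token; shorter patches give more, finer-grained tokens.
    print(f"patch size {p}: {tokens.shape[0]} tokens of length {tokens.shape[1]}")
```

With these assumed settings, the branch with patch size 8 sees twelve short tokens that each cover a fraction of a cycle, while the branch with patch size 48 sees two long tokens that each span two full cycles. This matches the abstract's distinction between localized, high-frequency patterns and long-term seasonalities.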

Paper Structure (38 sections, 7 equations, 15 figures, 9 tables)

Figures (15)

  • Figure 1: An example from Electricity dataset: MTST learns multi-scale temporal patterns in different branches, where $P_i$ stands for the patch size in $i$-th branch.
  • Figure 2: Multi-resolution Time-Series Transformer (MTST) Architecture.
  • Figure 3:
  • Figure 4:
  • Figure 6: Boxplot for ranks of the algorithms (based on their MSE) across seven datasets and four prediction horizons. The medians and means of the ranks are shown by the vertical lines and the black triangles respectively; whiskers extend to the minimum and maximum ranks.
  • ...and 10 more figures