Multi-resolution Time-Series Transformer for Long-term Forecasting

Yitian Zhang; Liheng Ma; Soumyasundar Pal; Yingxue Zhang; Mark Coates

TL;DR

The paper proposes the Multi-resolution Time-Series Transformer (MTST), a multi-branch architecture that models diverse temporal patterns at different resolutions simultaneously. It uses relative positional encoding, which is better suited for extracting periodic components at different scales.

Abstract

The performance of transformers for time-series forecasting has improved significantly. Recent architectures learn complex temporal patterns by segmenting a time-series into patches and using the patches as tokens. The patch size controls the ability of transformers to learn the temporal patterns at different frequencies: shorter patches are effective for learning localized, high-frequency patterns, whereas mining long-term seasonalities and trends requires longer patches. Inspired by this observation, we propose a novel framework, Multi-resolution Time-Series Transformer (MTST), which consists of a multi-branch architecture for simultaneous modeling of diverse temporal patterns at different resolutions. In contrast to many existing time-series transformers, we employ relative positional encoding, which is better suited for extracting periodic components at different scales. Extensive experiments on several real-world datasets demonstrate the effectiveness of MTST in comparison to state-of-the-art forecasting techniques.
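The patching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`patchify`, `multi_resolution_tokens`, `relative_position_bias`), the use of non-overlapping patches, and the lookup-table form of the relative positional term are all assumptions made for illustration. The sketch shows how one series yields many short tokens in a fine-resolution branch and a few long tokens in a coarse-resolution branch, and how a relative positional term depends only on the offset between two tokens.

```python
import numpy as np


def patchify(series, patch_size, stride):
    """Split a 1-D series into patches; each patch becomes one token."""
    series = np.asarray(series, dtype=float)
    if series.ndim != 1:
        raise ValueError("series must be one-dimensional")
    if patch_size > len(series):
        raise ValueError("patch_size cannot exceed the series length")
    n_patches = (len(series) - patch_size) // stride + 1
    starts = np.arange(n_patches) * stride
    # Resulting shape: (n_patches, patch_size)
    return np.stack([series[s:s + patch_size] for s in starts])


def multi_resolution_tokens(series, patch_sizes):
    """Tokenize the same series once per branch, one patch size per branch.

    Assumption: stride equals patch size (non-overlapping patches).
    """
    return {p: patchify(series, p, stride=p) for p in patch_sizes}


def relative_position_bias(n_tokens, table):
    """Build an (n_tokens, n_tokens) bias where entry (i, j) depends only on i - j.

    Assumption: `table` is a hypothetical lookup of length 2 * n_tokens - 1,
    indexed by the offset i - j shifted to be non-negative.
    """
    table = np.asarray(table, dtype=float)
    if table.shape != (2 * n_tokens - 1,):
        raise ValueError("table must have length 2 * n_tokens - 1")
    idx = np.arange(n_tokens)
    offsets = idx[:, None] - idx[None, :] + (n_tokens - 1)
    return table[offsets]


if __name__ == "__main__":
    # Four days of hourly readings with a daily cycle (synthetic data).
    t = np.arange(96)
    series = np.sin(2 * np.pi * t / 24)

    branches = multi_resolution_tokens(series, patch_sizes=[8, 24])
    for patch_size, tokens in branches.items():
        print(patch_size, tokens.shape)
```

With a 96-step input, the branch with patch size 8 sees 12 tokens that each cover a third of a day, while the branch with patch size 24 sees 4 tokens that each cover a full day. In a full model each branch would feed its tokens to its own transformer layers before the branch outputs are fused.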

Paper Structure (38 sections, 7 equations, 15 figures, 9 tables)

INTRODUCTION
PROBLEM STATEMENT
METHODOLOGY
Branch specific tokenization
Self-attention
Relative positional encoding
Fusing representations from all branches
RELATED WORK
Long-horizon Time-Series Forecasting
Multi-scale Feature Learning
Positional Encoding
EXPERIMENTS
Benchmarking MTST
Datasets
Baselines and Experimental Setup
...and 23 more sections

Figures (15)

Figure 1: An example from Electricity dataset: MTST learns multi-scale temporal patterns in different branches, where $P_i$ stands for the patch size in $i$-th branch.
Figure 2: Multi-resolution Time-Series Transformer (MTST) Architecture.
Figure 3:
Figure 4:
Figure 6: Boxplot for ranks of the algorithms (based on their MSE) across seven datasets and four prediction horizons. The medians and means of the ranks are shown by the vertical lines and the black triangles respectively; whiskers extend to the minimum and maximum ranks.
...and 10 more figures
