Multi-scale Transformer Pyramid Networks for Multivariate Time Series Forecasting

Yifan Zhang; Rui Wu; Sergiu M. Dascalu; Frederick C. Harris

TL;DR

This work addresses a limitation of existing transformer models for multivariate time series (MTS) forecasting: they model temporal dependencies at a single scale or at scales that grow exponentially. It introduces a dimension invariant (DI) embedding that projects the data into a higher-dimensional space while preserving both the time-step and variable dimensions, and a Multi-scale Transformer Pyramid Network (MTPNet) that models dependencies at unconstrained scales through a pyramid of encoder–decoder levels with inter-scale connections. The data is first decomposed into seasonal and trend components; MTPNet forecasts the seasonal component, a linear layer forecasts the trend, and the two predictions are summed to give the final forecast. On nine real-world benchmarks, MTPNet outperforms state-of-the-art baselines in terms of MSE and MAE.
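The seasonal/trend split can be illustrated with a minimal sketch. It assumes the trend is a centered moving average with edge replication and the seasonal part is the residual, which is common practice in decomposition-based forecasters; the summary does not state the exact procedure, and the function name `decompose`, the `kernel_size` of 25, and the toy series are all hypothetical.

```python
import numpy as np

def decompose(x, kernel_size=25):
    """Split a (time, variables) array into (seasonal, trend).

    Assumption: trend = centered moving average with edge replication,
    seasonal = x - trend. This is an illustrative choice, not the paper's
    confirmed procedure.
    """
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd so the window is centered")
    x = np.asarray(x, dtype=float)
    pad = (kernel_size - 1) // 2
    # Repeat the first and last rows so the trend keeps the input length.
    padded = np.concatenate(
        [np.repeat(x[:1], pad, axis=0), x, np.repeat(x[-1:], pad, axis=0)],
        axis=0,
    )
    kernel = np.ones(kernel_size) / kernel_size
    # Smooth each variable independently along the time axis.
    trend = np.stack(
        [np.convolve(padded[:, v], kernel, mode="valid") for v in range(x.shape[1])],
        axis=1,
    )
    return x - trend, trend

# Toy series: a 24-step cycle plus a slow drift, for two variables.
t = np.arange(96)
x = np.stack(
    [np.sin(2 * np.pi * t / 24) + 0.01 * t, np.cos(2 * np.pi * t / 24)],
    axis=1,
)
seasonal, trend = decompose(x)
```

In the framework described above, a seasonal model would forecast from `seasonal`, a linear layer would forecast from `trend`, and the two forecasts would be added.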

Abstract

Multivariate Time Series (MTS) forecasting involves modeling temporal dependencies within historical records. Transformers have demonstrated remarkable performance in MTS forecasting due to their capability to capture long-term dependencies. However, prior work has been confined to modeling temporal dependencies at either a fixed scale or multiple scales that exponentially increase (most with base 2). This limitation hinders their effectiveness in capturing diverse seasonalities, such as hourly and daily patterns. In this paper, we introduce a dimension invariant embedding technique that captures short-term temporal dependencies and projects MTS data into a higher-dimensional space, while preserving the dimensions of time steps and variables in MTS data. Furthermore, we present a novel Multi-scale Transformer Pyramid Network (MTPNet), specifically designed to effectively capture temporal dependencies at multiple unconstrained scales. The predictions are inferred from multi-scale latent representations obtained from transformers at various scales. Extensive experiments on nine benchmark datasets demonstrate that the proposed MTPNet outperforms recent state-of-the-art methods.
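The two ideas in the abstract, an embedding that adds a feature dimension without changing the time or variable dimensions, and scales that need not be powers of two, can be sketched as follows. This is a sketch under stated assumptions: the embedding is modeled as a bank of short temporal filters shared across variables, and each scale is modeled as a non-overlapping patching of the embedded map. The names `di_embed` and `to_patches`, the filter length of 3, the 8 channels, and the scales (4, 8, 12, 24) are hypothetical choices, not values taken from the paper.

```python
import numpy as np

def di_embed(x, weights, bias):
    """Embed a (T, V) series into a (C, T, V) feature map.

    Assumption: each of the C channels is a length-k temporal filter applied
    to every variable independently, with zero padding along time, so the
    T and V dimensions are preserved.
    """
    x = np.asarray(x, dtype=float)
    n_channels, k = weights.shape
    if k % 2 == 0:
        raise ValueError("filter length must be odd")
    n_steps, n_vars = x.shape
    pad = k // 2
    padded = np.pad(x, ((pad, pad), (0, 0)))  # pad the time axis only
    out = np.empty((n_channels, n_steps, n_vars))
    for t in range(n_steps):
        window = padded[t:t + k]                      # (k, V) local context
        out[:, t, :] = weights @ window + bias[:, None]
    return out

def to_patches(z, patch_len):
    """Reshape (C, T, V) into (C, T // patch_len, patch_len, V)."""
    n_channels, n_steps, n_vars = z.shape
    if n_steps % patch_len != 0:
        raise ValueError("patch_len must divide the number of time steps")
    return z.reshape(n_channels, n_steps // patch_len, patch_len, n_vars)

rng = np.random.default_rng(0)
x = rng.standard_normal((96, 7))           # 96 time steps, 7 variables
weights = rng.standard_normal((8, 3))      # 8 channels, filter length 3
z = di_embed(x, weights, np.zeros(8))      # shape (8, 96, 7)

# Hypothetical scales; 12 and 24 are not reachable by doubling from 4 alone.
patches = {p: to_patches(z, p) for p in (4, 8, 12, 24)}
```

Each entry of `patches` would feed one level of the pyramid. A scale such as 24 lets a level align with a daily cycle in hourly data, which a base-2 progression cannot do exactly.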

Paper Structure (27 sections, 10 equations, 4 figures, 3 tables)

Introduction
Related works
MTS forecasting
Transformers
Method
Decomposition
Transformer Feature Pyramid
Dimension Invariant Embedding
Transformer Encoder and Decoder
Encoder:
Decoder:
Multi-scale Prediction
Experiments
Experimental Settings
Data:
...and 12 more sections

Figures (4)

Figure 1: Illustration of the overall framework: Decomposition of MTS data into seasonal and trend-cyclical components, employing Multi-scale Transformer Pyramid Networks (MTPNet) as the seasonal model and a linear layer as the trend model. The seasonal and trend predictions are summed to obtain the final predictions.
Figure 2: Illustration of spatial, temporal, and dimension invariant embedding techniques.
Figure 3: Left: The workflow of a single-level transformer-based encoder-decoder pair. Right: Illustration of the proposed multi-scale transformer pyramid network (MTPNet).
Figure 4: The forecasting results in terms of MAE for different look-back window sizes at horizons 96 and 720.
