ARM: Refining Multivariate Forecasting with Adaptive Temporal-Contextual Learning

Jiecheng Lu, Xu Han, Shihao Yang

TL;DR

The paper tackles the challenge of multivariate long-term time series forecasting by identifying that existing multivariate Transformers struggle to model series-wise differences. It proposes ARM, a modular framework with Adaptive Univariate Effect Learning (AUEL), Random Dropping (RD), and Multi-kernel Local Smoothing (MKLS), to disentangle univariate patterns, decouple inter-series dependencies during training, and construct flexible local temporal representations. The approach yields state-of-the-art results on nine datasets with only a modest increase in computation and is transferable to other LTSF architectures beyond vanilla Transformers. Collectively, AUEL, RD, and MKLS enable robust, scalable handling of diverse multivariate time series, offering practical improvements for real-world forecasting tasks.

Abstract

Long-term time series forecasting (LTSF) is important for various domains but is confronted by challenges in handling complex temporal-contextual relationships. As multivariate input models underperform some recent univariate counterparts, we posit that the issue lies in the inefficiency of existing multivariate LTSF Transformers in modeling series-wise relationships: the characteristic differences between series are often captured incorrectly. To address this, we introduce ARM: a multivariate temporal-contextual adaptive learning method, which is an enhanced architecture specifically designed for multivariate LTSF modelling. ARM employs Adaptive Univariate Effect Learning (AUEL), a Random Dropping (RD) training strategy, and Multi-kernel Local Smoothing (MKLS) to better handle the temporal patterns of individual series and correctly learn inter-series dependencies. ARM demonstrates superior performance on multiple benchmarks without significantly increasing computational costs compared to the vanilla Transformer, thereby advancing the state-of-the-art in LTSF. ARM is also generally applicable to other LTSF architectures beyond the vanilla Transformer.
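
The abstract names Random Dropping (RD) as a training strategy but does not spell out its mechanics here. The following is a minimal sketch of one plausible reading, assuming that RD zeroes out a random subset of the input series in each training window so that the model cannot rely on every series always being present. The function name random_drop, the drop_prob parameter, and the zero-fill choice are illustrative assumptions, not the authors' implementation.

```python
import random


def random_drop(window, drop_prob=0.5, rng=None):
    """Zero out a random subset of series in a (time x series) window.

    Hypothetical helper: each series (column) is dropped independently
    with probability drop_prob, and at least one series is always kept.
    Returns the masked window and the boolean keep-mask.
    """
    if rng is None:
        rng = random.Random()
    n_series = len(window[0])
    # Decide per series, not per time step, so a dropped series is
    # removed across the whole window.
    keep = [rng.random() >= drop_prob for _ in range(n_series)]
    if not any(keep):
        keep[rng.randrange(n_series)] = True
    masked = [[v if k else 0.0 for v, k in zip(row, keep)] for row in window]
    return masked, keep


if __name__ == "__main__":
    toy_window = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    masked, keep = random_drop(toy_window, drop_prob=0.5, rng=random.Random(0))
    print(keep)
    print(masked)
```

In a training loop, a mask like this would be redrawn for every batch; whether the matching target series are also masked is a design choice this sketch leaves open.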

Paper Structure

This paper contains 31 sections, 4 equations, 7 figures, and 7 tables.

Figures (7)

  • Figure 1: Three intuitive problems arising from wrongly handling input series with characteristic differences (further explained in the paper). (a) The necessity of adaptively estimating the output mean for series with diverse characteristics (see the AUEL section). The Adaptive EMA (green) in our method outperforms previous approaches like RevIN (yellow) and NLinear (red) by adapting to varied temporal dependencies. (b) The necessity of building a reasonable multivariate temporal representation (see the MKLS section). Our multi-window local convolutional module addresses the inadequacy of former models in modeling different local patterns across series. (c) The inability of previous models to learn obvious inter-series dependencies: on the "Multi" dataset, generated with simple shifting (see the paper for details), existing models like Autoformer fail to learn this simple "copy-paste" operation. A separate figure in the paper visualizes ARM's performance boost on this dataset.
  • Figure 2: Overall Architecture of ARM with Vanilla Transformer as encoder-decoder. The left side depicts the global workflow and the right side illustrates the specific process in the model training.
  • Figure 3: This figure illustrates six LTSF model types, categorized as univariate (Methods a-c) and multivariate (Methods d-f). Parallel arrows show independent series processing, and converging arrows indicate series mixing. Univariate models (a-c) process series separately: (a) employs individual models like DLinear for each series, (b) uses shared-parameter models such as PatchTST for all series, and (c) dynamically chooses the best model per series, as in the AUEL section. Multivariate models (d-f) capture inter-series dependencies: (d) uses standard structures such as Informer and Autoformer, which are outperformed by univariate methods; (e) combines univariate models with multivariate factors to better build inter-series dependencies, but adds computational complexity; (f) introduces our Random Dropping strategy (see the Random Dropping section) for efficient learning of inter-series relationships.
  • Figure 4: Structure of Multi-kernel Local Smoothing (MKLS). The central part of the figure illustrates the computation of MKLS, incorporating multiple 1D convolutions and channel attention. The left and right sides present the application of Pre-MKLS and Post-MKLS, respectively.
  • Figure 5: Predictability Analysis of Datasets (Elec. means Electricity)
  • ...and 2 more figures
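
The caption of Figure 1(c) refers to a "Multi" dataset generated with simple shifting, on which the target series are effectively "copy-paste" versions of another series. The exact construction is not given in this summary, so the following is only a toy sketch of that idea, assuming each series is a lagged copy of one base signal. The function name make_shifted_series, the sinusoidal base signal, and the lag values are all illustrative assumptions.

```python
import math


def make_shifted_series(length, lags, period=24.0):
    """Build a (time x series) toy dataset of lagged copies of one signal.

    Hypothetical construction: column j at time t equals the base signal
    delayed by lags[j] steps, so every column is a shifted copy of the
    zero-lag column.
    """
    max_lag = max(lags)
    # Generate extra leading samples so every lagged column is fully defined.
    base = [
        math.sin(2 * math.pi * t / period)
        + 0.5 * math.sin(2 * math.pi * t / (3.5 * period))
        for t in range(length + max_lag)
    ]
    return [[base[t + max_lag - lag] for lag in lags] for t in range(length)]


if __name__ == "__main__":
    data = make_shifted_series(length=48, lags=[0, 3, 7])
    # Series 1 at time t repeats series 0 at time t - 3.
    print(data[10][1] == data[7][0])
```

On data like this, a model that mixes series correctly can forecast a lagged column by copying recent values of the zero-lag column, which is the kind of inter-series dependency the caption says existing models fail to learn.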