HN-MVTS: HyperNetwork-based Multivariate Time Series Forecasting

Andrey Savchenko, Oleg Kachan

TL;DR

HN-MVTS introduces a hypernetwork-driven prior that generates channel-specific final-layer weights for multivariate time series forecasting, bridging channel-independent robustness and channel-dependent expressiveness. By conditioning per-channel outputs on learnable embeddings, the method can be plugged into diverse backbones with only training-time overhead and no inference cost increase. Empirical results across eight datasets and multiple architectures show consistent improvements, especially on high-dimensional or highly correlated data, while maintaining practical training efficiency. The work highlights hypernetwork-based parameter generation as a versatile direction for enhancing existing forecasting methods in complex, real-world settings.

Abstract

Accurate forecasting of multivariate time series data remains a formidable challenge, particularly due to the growing complexity of temporal dependencies in real-world scenarios. While neural network-based models have achieved notable success in this domain, complex channel-dependent models often perform worse than channel-independent models, which ignore the relationships between components but are highly robust because of their small capacity. In this work, we propose HN-MVTS, a novel architecture that integrates a hypernetwork-based generative prior with an arbitrary neural network forecasting model. The input of this hypernetwork is a learnable embedding matrix of time series components. To restrict the number of new parameters, the hypernetwork learns to generate only the weights of the last layer of the target forecasting network, serving as a data-adaptive regularizer that improves generalization and long-range predictive accuracy. The hypernetwork is used only during training, so it does not increase the inference time compared to the base forecasting model. Extensive experiments on eight benchmark datasets demonstrate that applying HN-MVTS to state-of-the-art models (DLinear, PatchTST, TSMixer, etc.) typically improves their performance. Our findings suggest that hypernetwork-driven parameterization offers a promising direction for enhancing existing forecasting techniques in complex scenarios.
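
The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the hypernetwork is a single linear map from a channel embedding to a flattened last-layer weight matrix, omits training entirely, and uses hypothetical names throughout (HyperLinearHead, generate_weights, materialize). It only shows how per-channel weights can be generated from embeddings and then cached, so that the hypernetwork is not needed at inference.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


class HyperLinearHead:
    """Hypothetical last layer whose per-channel weights come from a hypernetwork.

    Assumption: the hypernetwork is one linear map from an embedding of size
    emb_dim to a flattened (horizon x hidden) weight matrix. The paper may use
    a different hypernetwork; this is only an illustration.
    """

    def __init__(self, n_channels, emb_dim, hidden, horizon, seed=0):
        rng = random.Random(seed)
        self.hidden, self.horizon = hidden, horizon
        # Channel embeddings, one row per time series component
        # (learnable in a real model; fixed random values here).
        self.emb = [[rng.gauss(0, 1) for _ in range(emb_dim)]
                    for _ in range(n_channels)]
        # Hypernetwork parameters, shared across all channels.
        self.hyper = [[rng.gauss(0, 0.1) for _ in range(emb_dim)]
                      for _ in range(horizon * hidden)]

    def generate_weights(self, channel):
        """Map one channel embedding to a (horizon x hidden) weight matrix."""
        flat = matvec(self.hyper, self.emb[channel])
        return [flat[i * self.hidden:(i + 1) * self.hidden]
                for i in range(self.horizon)]

    def forward(self, features, channel):
        """Training-time path: run the hypernetwork, then apply its output."""
        return matvec(self.generate_weights(channel), features)

    def materialize(self):
        """Cache the generated weights so the hypernetwork can be dropped."""
        return [self.generate_weights(c) for c in range(len(self.emb))]


head = HyperLinearHead(n_channels=3, emb_dim=4, hidden=5, horizon=2)
features = [0.5, -1.0, 0.25, 2.0, 1.5]  # stand-in for backbone output, one channel

train_pred = head.forward(features, channel=1)   # goes through the hypernetwork
frozen = head.materialize()                      # plain per-channel weight matrices
infer_pred = matvec(frozen[1], features)         # ordinary linear layer, no hypernetwork
```

Here train_pred and infer_pred coincide because the cached matrices are exactly the ones the hypernetwork would produce, which is why the inference path costs the same as a plain linear last layer.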

Paper Structure

This paper contains 13 sections, 4 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Top left: channel-dependent approach. Bottom left: channel-independent approach. Right: the proposed HN-MVTS forecasting approach, in which a hypernetwork takes a channel embedding as input and outputs the parameters of the forecasting model's last layer.
  • Figure 2: Examples of adding our HN-MVTS to various forecasting models. Left: Linear model. Center: Fully-connected neural network. Right: Transformer model with attention (ATTN) and feed-forward (FFN) layers.
  • Figure 3: HN-MVTS embeddings on the PEMS08 dataset: (a) initial embeddings before training, (b)-(d) learned embeddings for TSMixer, iTransformer, and PatchTST.
  • Figure 4: Test-set MSE during training for the DLinear base model.
  • Figure 5: Test-set MSE during training for the TSMixer base model.
  • ...and 3 more figures