AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting

Abdelhakim Benechehab, Vasilii Feofanov, Giuseppe Paolo, Albert Thomas, Maurizio Filippone, Balázs Kégl

TL;DR

AdaPTS tackles the challenge of using pre-trained univariate time-series foundation models for multivariate probabilistic forecasting by introducing invertible, stochastic adapters that project multivariate inputs into a latent space, apply a frozen univariate FM per channel, and decode back to the original space. By incorporating probabilistic adapters (VAE and dropout-based) and several adapter families (linear and nonlinear autoencoders, with optional normalizing flows), the framework delivers improved forecasting accuracy and calibrated uncertainty across diverse real-world datasets while enabling dimensionality reduction. Key findings show substantial MSE/MAE gains, interpretable latent representations, and reasonably well-calibrated predictive distributions, particularly with VAE-based adapters, though longer-horizon calibration remains challenging. The approach offers a modular, scalable path to broaden the applicability of time-series foundation models in practical, uncertain environments, with publicly available code to promote reproducibility.

Abstract

Pre-trained foundation models (FMs) have shown exceptional performance in univariate time series forecasting tasks. However, several practical challenges persist, including managing intricate dependencies among features and quantifying uncertainty in predictions. This study aims to tackle these critical limitations by introducing adapters: feature-space transformations that facilitate the effective use of pre-trained univariate time series FMs for multivariate tasks. Adapters operate by projecting multivariate inputs into a suitable latent space and applying the FM independently to each dimension. Inspired by the literature on representation learning and partially stochastic Bayesian neural networks, we present a range of adapters and optimization/inference strategies. Experiments conducted on both synthetic and real-world datasets confirm the efficacy of adapters, demonstrating substantial enhancements in forecasting accuracy and uncertainty quantification compared to baseline methods. Our framework, AdaPTS, positions adapters as a modular, scalable, and effective solution for leveraging time series FMs in multivariate contexts, thereby promoting their wider adoption in real-world applications. We release the code at https://github.com/abenechehab/AdaPTS.
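
The encode, forecast-per-dimension, decode pipeline described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the AdaPTS implementation: `toy_univariate_fm` is a hypothetical stand-in for a frozen FM (it simply repeats the last observed value), and the adapter is a PCA-style orthonormal projection chosen here only because it is easy to invert. All function names are invented for this example.

```python
import numpy as np


def toy_univariate_fm(context, horizon):
    """Hypothetical stand-in for a frozen univariate FM.

    context: (L,) array. Returns a (horizon,) forecast that repeats
    the last observed value.
    """
    return np.full(horizon, context[-1])


def fit_linear_adapter(X, n_components):
    """Hypothetical linear adapter: PCA-style orthonormal projection.

    X: (L, D) multivariate context. Returns the feature mean (D,)
    and a projection matrix W of shape (D, n_components).
    """
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T


def forecast_with_adapter(X, horizon, n_components):
    """Encode, forecast each latent dimension independently, decode."""
    mean, W = fit_linear_adapter(X, n_components)
    Z = (X - mean) @ W                        # encode: (L, D')
    Z_hat = np.stack(
        [toy_univariate_fm(Z[:, k], horizon) for k in range(Z.shape[1])],
        axis=1,
    )                                         # per-dimension forecast: (H, D')
    return Z_hat @ W.T + mean                 # decode: (H, D)


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
Y_full = forecast_with_adapter(X, horizon=4, n_components=3)
Y_reduced = forecast_with_adapter(X, horizon=4, n_components=2)
```

With `n_components` equal to the number of features, the projection is orthogonal and the toy forecast reduces to repeating the last observed row of `X`. With fewer components, the same code forecasts in a lower-dimensional latent space and decodes back to all features.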

Paper Structure

This paper contains 30 sections, 3 theorems, 28 equations, 8 figures, 4 tables.

Key Result

Proposition 3.4

Under Assumptions ass:invertible and ass:fm, the problem [equation not recovered in this extract] admits a closed-form solution [equation not recovered in this extract], where $\mathbf{W}_{\varphi}^* = \mathop{\mathrm{arg\,min}}\limits_{\mathbf{W}_{\varphi} \in \mathcal{GL}_D(\mathbb{R})} \mathcal{L} (\mathbf{W}_{\varphi})$, $\mathbf{A} = \mathbf{Y} - \mathbf{W}_{FM}^\top \mathbf{X}$, $\mathbf{B} = \mathbf{b}_{FM} \mathbf{1}^\top$, and $(\mathbf{B}^{\top} \mathbf{A})^{+}$ d
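
To make the roles of $\mathbf{A}$ and $\mathbf{B}$ concrete, the sketch below checks one algebraic identity numerically. It rests on assumptions that this extract does not confirm: that the FM is the affine map $\mathbf{x} \mapsto \mathbf{W}_{FM}^\top \mathbf{x} + \mathbf{b}_{FM}$ applied to each channel, that the adapter is an invertible matrix $\mathbf{W}_{\varphi}$ applied by right-multiplication, and that shapes are $\mathbf{X} \in \mathbb{R}^{L \times D}$ and $\mathbf{Y} \in \mathbb{R}^{H \times D}$. Under that reading, the adapted forecast equals $\mathbf{W}_{FM}^\top \mathbf{X} + \mathbf{B} \mathbf{W}_{\varphi}^{-1}$, so the forecast residual is $\mathbf{A} - \mathbf{B} \mathbf{W}_{\varphi}^{-1}$. The code does not compute the closed-form optimum, which is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
L, H, D = 16, 4, 3                 # context length, horizon, feature count

X = rng.normal(size=(L, D))        # multivariate context
Y = rng.normal(size=(H, D))        # ground-truth future values
W_fm = rng.normal(size=(L, H))     # assumed linear FM weights
b_fm = rng.normal(size=H)          # assumed linear FM bias
# Shifted random matrix used as an invertible adapter (assumption).
W_phi = rng.normal(size=(D, D)) + 3.0 * np.eye(D)
W_phi_inv = np.linalg.inv(W_phi)


def linear_fm(Z):
    """Apply the same affine map to every channel: (L, D) -> (H, D)."""
    return W_fm.T @ Z + b_fm[:, None]


def adapted_forecast(X):
    """Encode with W_phi, forecast each channel, decode with its inverse."""
    Z = X @ W_phi                  # encode
    Z_hat = linear_fm(Z)           # channel-wise forecast in latent space
    return Z_hat @ W_phi_inv       # decode


A = Y - W_fm.T @ X                 # residual of the FM without its bias
B = np.outer(b_fm, np.ones(D))     # bias broadcast over features

forecast = adapted_forecast(X)
expected = W_fm.T @ X + B @ W_phi_inv
residual = Y - forecast
```

In this setting the adapter influences the forecast only through the bias term $\mathbf{B} \mathbf{W}_{\varphi}^{-1}$, which is consistent with the proposition's solution being expressed in terms of $\mathbf{A}$ and $\mathbf{B}$.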

Figures (8)

  • Figure 1: (a) Augmenting the $\texttt{Moment}$ time series foundation model with the AdaPTS framework provides probabilistic and more accurate predictions. (b) The AdaPTS framework: the input time series is transformed through a feature-space transformation $\varphi$ that maps into a stochastic latent space. The prediction is then made using a pre-trained FM, and the predicted output, now a distribution, is transformed back to the original feature space. The fire symbol indicates trainable weights, while the snowflake indicates that the parameters of the FM are kept frozen.
  • Figure 2: Optimality of $\mathbf{W}_{\varphi}^*$. Comparing the MSE obtained with $\mathbf{W}_{\varphi}^*$ against the baseline, for 1000 randomly generated linear FMs.
  • Figure 3: Impact of the number of components on model performance. The dashed line indicates $\texttt{Moment}$ performance without adapters, the shaded area its standard deviation, and the vertical line the number of original features.
  • Figure 4: Visualization of the latent representation obtained by different adapters (with the number of components equal to 2) on Illness ($H=24$). Shaded colors indicate the time dimension, with lighter colors representing earlier timesteps.
  • Figure 5: Reliability diagram for the first feature of the ETTh1 ($H=96$) dataset using $\texttt{LinearVAE}$.
  • ...and 3 more figures

Theorems & Definitions (10)

  • Definition 3.1: adapter
  • Proposition 3.4: Optimal linear adapter
  • proof
  • Remark 3.5
  • Proposition 4.1: $\texttt{VAE}$ adapter training objective
  • Remark 4.2
  • Remark 4.3
  • Proposition 1.3: Optimal linear adapter
  • proof
  • Remark 1.4