Distributional Drift Adaptation with Temporal Conditional Variational Autoencoder for Multivariate Time Series Forecasting

Hui He, Qi Zhang, Kun Yi, Kaize Shi, Zhendong Niu, Longbing Cao

TL;DR

This work tackles distribution drift in non-stationary multivariate time series (MTS) forecasting by modeling the evolving conditional distribution $p(\mathcal{Y}|\mathcal{X},\mathcal{C})$ with a Temporal Conditional Variational Autoencoder (TCVAE). It introduces a temporal Hawkes attention mechanism to capture temporal factors, a gated attention mechanism to adapt the Transformer encoder-decoder architecture, and a conditional continuous normalizing flow to transform latent Gaussian variables into flexible, form-free distributions. Empirical results on six real-world datasets demonstrate robust, state-of-the-art forecasting performance, with ablations confirming the importance of each component. The approach offers practical benefits for drift-aware prediction in complex, high-dimensional MTS domains and points to future work on varying drift frequencies and broader applications.

Abstract

Owing to their non-stationary nature, real-world multivariate time series (MTS) have distributions that change over time, a phenomenon known as distribution drift. Most existing MTS forecasting models suffer greatly from distribution drift, and their forecasting performance degrades over time. Existing methods address distribution drift by adapting to the most recently arrived data or by self-correcting according to meta-knowledge derived from future data. Despite their success in MTS forecasting, these methods hardly capture the intrinsic distribution changes, especially from a distributional perspective. Accordingly, we propose a novel framework, the temporal conditional variational autoencoder (TCVAE), which models the dynamic distributional dependencies over time between historical observations and future data in MTS and infers these dependencies as a temporal conditional distribution by leveraging latent variables. Specifically, a novel temporal Hawkes attention mechanism represents temporal factors, which are subsequently fed into feed-forward networks to estimate the prior Gaussian distribution of the latent variables. The representation of temporal factors further dynamically adjusts the structures of the Transformer-based encoder and decoder to distribution changes by leveraging a gated attention mechanism. Moreover, we introduce a conditional continuous normalizing flow to transform the prior Gaussian into a complex, form-free distribution, facilitating flexible inference of the temporal conditional distribution. Extensive experiments on six real-world MTS datasets demonstrate TCVAE's superior robustness and effectiveness over state-of-the-art MTS forecasting baselines. We further illustrate the applicability of TCVAE through multifaceted case studies and visualizations in real-world scenarios.
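
For orientation, the temporal conditional distribution described above can be related to the generic evidence lower bound of a conditional variational autoencoder. The bound below is the standard CVAE form, written here as an assumption and not necessarily the exact objective used in the paper. The latent variable $z$ and the parameter symbols $\theta$ and $\phi$ are introduced only for illustration, and $\mathcal{C}$ denotes the temporal factors.

$$
\log p_\theta(\mathcal{Y}\mid\mathcal{X},\mathcal{C}) \;\ge\; \mathbb{E}_{q_\phi(z\mid\mathcal{X},\mathcal{Y},\mathcal{C})}\big[\log p_\theta(\mathcal{Y}\mid\mathcal{X},\mathcal{C},z)\big] \;-\; \mathbb{KL}\big(q_\phi(z\mid\mathcal{X},\mathcal{Y},\mathcal{C})\,\big\|\,p_\theta(z\mid\mathcal{X},\mathcal{C})\big)
$$

In this reading, the prior $p_\theta(z\mid\mathcal{X},\mathcal{C})$ corresponds to the Gaussian estimated from the temporal factors, and the normalizing flow makes the inferred latent distribution more flexible than a plain Gaussian.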

Paper Structure (24 sections, 27 equations, 7 figures, 6 tables, 1 algorithm)

Figures (7)

  • Figure 1: An example of MTS traffic flow derived from METR-LA.
  • Figure 2: The overall architecture of TCVAE for MTS forecasting. (1) The input $\mathcal{X}_t$ is first fed into the input representation module and $\{\bm{u}_{t-w+1},...,\bm{u}_{\tau},...,\bm{u}_t\}$ is extracted separately for temporal factor representation; (2) The encoder $\mathcal{T}_e$ takes $\bm{{\rm X}}$ as input and controls the information flow output from multiple heads by introducing temporal factors $\bm{{\rm C}}$ into gated attention. The corresponding details of the decoder $\mathcal{T}_d$ are similar to the encoder $\mathcal{T}_e$ with input $\tilde{\bm{{\rm X}}}$ and $\tilde{\bm{{\rm C}}}$; (3) The final loss is a combination of forecasting loss, backcasting loss, and $\mathbb{KL}$ divergence.
  • Figure 3: Parameter sensitivity analysis of TCVAE on MAE and MAPE.
  • Figure 4: Distribution of different time windows, METR-LA. The left part is a comparison of flexible and Gaussian posterior in FDA. Comparisons of the target distribution, predicted distribution, and Gaussian distribution are shown in the right part.
  • Figure 5: Visualization of correlation matrix at 8:00-10:00 A.M. (the 96th window) and 10:00-12:00 P.M. (the 288th window) on 9/1/2018, METR-LA.
  • ...and 2 more figures
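
The Figure 2 caption describes the final loss as a combination of forecasting loss, backcasting loss, and KL divergence. A minimal sketch of such a combination follows, under several assumptions that the caption does not confirm: mean squared error for both reconstruction terms, diagonal Gaussians for both the posterior and the prior (so the KL term has a closed form), and hypothetical weights `beta` and `gamma`. With a flow-transformed posterior, as in the paper, the KL term would generally have to be estimated differently.

```python
import math


def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between two diagonal Gaussians, summed over latent dimensions."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(forecast, future, backcast, history,
                  mu_q, logvar_q, mu_p, logvar_p,
                  beta=1.0, gamma=1.0):
    """Forecasting loss + gamma * backcasting loss + beta * KL divergence.

    beta and gamma are hypothetical weights, not taken from the paper.
    """
    forecasting = mse(forecast, future)   # error on the predicted future window
    backcasting = mse(backcast, history)  # error on the reconstructed input window
    kl = gaussian_kl(mu_q, logvar_q, mu_p, logvar_p)
    return forecasting + gamma * backcasting + beta * kl


if __name__ == "__main__":
    loss = combined_loss(
        forecast=[1.0, 2.0], future=[1.0, 2.5],
        backcast=[0.0, 0.0], history=[0.0, 0.1],
        mu_q=[0.2], logvar_q=[0.0], mu_p=[0.0], logvar_p=[0.0],
    )
    print(loss)
```

The KL term is zero when posterior and prior coincide, so in this sketch it only penalizes the posterior for moving away from the prior conditioned on the temporal factors.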