Conditionally Whitened Generative Models for Probabilistic Time Series Forecasting

Yanfeng Yang, Siwei Chen, Pingping Hu, Zhaotong Shen, Yingjie Zhang, Zhuoran Sun, Shuai Li, Ziqi Chen, Kenji Fukumizu

TL;DR

This work addresses probabilistic forecasting for multivariate time series under non-stationarity and distribution shift by introducing Conditionally Whitened Generative Models (CW-Gen). CW-Gen combines a Joint Mean–Covariance Estimator (JMCE) with conditional whitening to create CW-Diff and CW-Flow, enabling priors based on the conditional mean and sliding-window covariance to guide diffusion and flow generation. The authors derive a sufficient KL-divergence bound showing when a learned conditional Gaussian terminal distribution improves sample quality, and they implement JMCE to produce these priors with PSD covariance control. Empirically, CW-Gen improves probabilistic and, in many cases, point forecasting performance across five real-world datasets and six baselines, while mitigating distribution shift; the framework is extensible to other diffusion/flow models and supports end-to-end training. Overall, CW-Gen provides a principled, modular approach to integrate informative priors into generative time-series models for more accurate and robust forecasting.

Abstract

Probabilistic forecasting of multivariate time series is challenging due to non-stationarity, inter-variable dependencies, and distribution shifts. While recent diffusion and flow matching models have shown promise, they often ignore informative priors such as conditional means and covariances. In this work, we propose Conditionally Whitened Generative Models (CW-Gen), a framework that incorporates prior information through conditional whitening. Theoretically, we establish sufficient conditions under which replacing the traditional terminal distribution of diffusion models, namely the standard multivariate normal, with a multivariate normal distribution parameterized by estimators of the conditional mean and covariance improves sample quality. Guided by this analysis, we design a novel Joint Mean-Covariance Estimator (JMCE) that simultaneously learns the conditional mean and sliding-window covariance. Building on JMCE, we introduce Conditionally Whitened Diffusion Models (CW-Diff) and extend them to Conditionally Whitened Flow Matching (CW-Flow). Experiments on five real-world datasets with six state-of-the-art generative models demonstrate that CW-Gen consistently enhances predictive performance, capturing non-stationary dynamics and inter-variable correlations more effectively than prior-free approaches. Empirical results further demonstrate that CW-Gen can effectively mitigate the effects of distribution shift.
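As a rough illustration of what conditional whitening means, the sketch below standardizes a target vector using an estimated conditional mean and covariance, then maps it back. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `mu_hat`, `sigma_hat`, `whiten`, and `unwhiten` are hypothetical, the numbers are made up, and a Cholesky factor stands in for whatever matrix square root the paper uses. In practice the estimates would come from JMCE and the linear algebra from NumPy or PyTorch; plain Python is used here only to keep the example dependency-free.

```python
import math

def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def whiten(x, mu_hat, L):
    """Whitened vector z solving L z = x - mu_hat (forward substitution)."""
    z = [0.0] * len(x)
    for i in range(len(x)):
        s = sum(L[i][k] * z[k] for k in range(i))
        z[i] = (x[i] - mu_hat[i] - s) / L[i][i]
    return z

def unwhiten(z, mu_hat, L):
    """Inverse map x = mu_hat + L z, applied to samples drawn in whitened space."""
    return [mu_hat[i] + sum(L[i][k] * z[k] for k in range(i + 1))
            for i in range(len(z))]

# Hypothetical estimates for a 2-dimensional target (illustrative values only).
mu_hat = [1.0, -2.0]
sigma_hat = [[4.0, 1.2], [1.2, 1.0]]

L = cholesky(sigma_hat)
x = [2.5, -1.0]
z = whiten(x, mu_hat, L)         # what a generative model would be trained on
x_back = unwhiten(z, mu_hat, L)  # generated samples are mapped back this way
```

If the estimates are accurate, the whitened targets have approximately zero mean and identity covariance, which is the regime the standard normal terminal distribution is designed for.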

Paper Structure

This paper contains 36 sections, 3 theorems, 47 equations, 5 figures, 21 tables.

Key Result

Theorem 1

Let $P_{X|C}$ denote the true conditional distribution of $X \in \mathbb{R}^{d_x}$ given $C$, with conditional mean $\mu_{X|C}$ and positive-definite conditional covariance $\Sigma_{X|C}$. Define $Q_0 := N(0,I_{d_x})$ and $\widehat{Q} := N( \widehat{\mu}_{X|C}, \widehat{\Sigma}_{X|C} )$, where […] where $\left\| \Sigma_{X|C} - \widehat{\Sigma}_{X|C} \right\|_N=\sum_{i=1}^{d_x} \widetilde{s}_i$ […]
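The statement above is truncated, so the following is only a numerical aid, assuming the comparison of interest is between $\mathrm{KL}(P_{X|C} \,\|\, \widehat{Q})$ and $\mathrm{KL}(P_{X|C} \,\|\, Q_0)$. For any $P$ with finite second moments and finite differential entropy, the KL divergence to a Gaussian depends on $P$ only through its entropy, mean, and covariance, and the entropy cancels in the difference. The hypothetical helper `kl_gap` below evaluates that difference in plain Python; it does not reproduce the theorem's sufficient condition, and all names and numbers are illustrative.

```python
import math

def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def solve_spd(L, b):
    """Solve (L L^T) y = b by forward, then backward substitution."""
    n = len(b)
    w = [0.0] * n
    for i in range(n):
        w[i] = (b[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i]
    y = [0.0] * n
    for i in reversed(range(n)):
        y[i] = (w[i] - sum(L[k][i] * y[k] for k in range(i + 1, n))) / L[i][i]
    return y

def kl_gap(mu, sigma, mu_hat, sigma_hat):
    """KL(P || N(mu_hat, sigma_hat)) - KL(P || N(0, I)) for P with mean mu, cov sigma.

    A negative value means the learned Gaussian is closer to P than N(0, I).
    """
    n = len(mu)
    L = cholesky(sigma_hat)
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    # tr(sigma_hat^{-1} sigma), computed one column of sigma at a time
    trace = sum(solve_spd(L, [sigma[r][c] for r in range(n)])[c] for c in range(n))
    diff = [mu[i] - mu_hat[i] for i in range(n)]
    maha = sum(d * y for d, y in zip(diff, solve_spd(L, diff)))
    # cross-entropy terms of N(0, I): tr(sigma) + ||mu||^2
    base = sum(sigma[i][i] for i in range(n)) + sum(m * m for m in mu)
    return 0.5 * (logdet + trace + maha - base)

# Illustrative true moments of P (made-up values).
mu = [3.0, -1.0]
sigma = [[4.0, 1.2], [1.2, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

gap_exact = kl_gap(mu, sigma, mu, sigma)               # perfect estimates: negative here
gap_prior = kl_gap(mu, sigma, [0.0, 0.0], identity)    # Q_hat equals Q_0: zero
gap_poor = kl_gap(mu, sigma, [30.0, -10.0], identity)  # badly biased mean: positive
```

The three cases show why a condition on estimation error is needed: an accurate estimator shrinks the divergence, while a sufficiently poor one makes the learned terminal distribution worse than the standard normal.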

Figures (5)

  • Figure 1: The flowchart of JMCE, CW-Diff and CW-Flow.
  • Figure 2: Comparison of Diffusion-TS, NsDiff, FlowTS, and their CW variants on ETTh1 across Dimensions 1 and 2. "True ETTh1" is the real time series from the ETTh1 dataset. "Sample mean" and "standard deviation" refer to the mean and standard deviation of 100 samples generated by the generative models. "One sample" refers to a randomly chosen instance among the 100 generated samples.
  • Figure 3: Comparison of all models on ETTh1, ETTh2, ILI and Weather.
  • Figure 4: Comparison between the learning targets and the predictions of JMCE (top: training set, bottom: test set).
  • Figure 5: Comparison of DSPD, CW-SSSD, TsFlow and CW-FlowTS on the first dimension of ETTh1.

Theorems & Definitions (3)

  • Theorem 1
  • Theorem 2
  • Lemma 1