WaveletDiff: Multilevel Wavelet Diffusion For Time Series Generation
Yu-Hsiang Wang, Olgica Milenkovic
TL;DR
WaveletDiff generates high-quality time series by running the diffusion process directly in the wavelet domain, which captures multi-scale temporal-spectral structure. It combines per-level transformers, cross-level attention, and an energy-preservation term based on Parseval's theorem within a DDPM framework. Across six real-world datasets, WaveletDiff outperforms state-of-the-art time-domain and frequency-domain baselines on multiple metrics, with discriminative and Context-FID scores that are on average about $3\times$ smaller than those of the second-best baseline. The approach demonstrates robust multi-scale synthesis, adaptability to different wavelet bases, and reproducibility across representations, underscoring its practical impact for data augmentation, privacy preservation, and forecasting research.
Abstract
Time series are ubiquitous in applications that involve forecasting, classification, and causal inference, such as healthcare, finance, audio signal processing, and climate science. Still, large, high-quality time series datasets remain scarce. Synthetic generation can address this limitation; however, current models confined to either the time or the frequency domain struggle to reproduce the inherently multi-scale structure of real-world time series. We introduce WaveletDiff, a novel framework that trains diffusion models directly on wavelet coefficients to exploit the inherent multi-resolution structure of time series data. The model combines dedicated transformers for each decomposition level with cross-level attention mechanisms that enable selective information exchange between temporal and frequency scales through adaptive gating. It also incorporates per-level energy preservation constraints based on Parseval's theorem to preserve spectral fidelity throughout the diffusion process. Comprehensive tests across six real-world datasets from the energy, finance, and neuroscience domains demonstrate that WaveletDiff consistently outperforms state-of-the-art time-domain and frequency-domain generative methods on both short and long time series across five diverse performance metrics. For example, WaveletDiff achieves discriminative scores and Context-FID scores that are $3\times$ smaller on average than those of the second-best baseline across all datasets.
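To make the energy-preservation idea concrete, the following minimal sketch decomposes a signal with an orthonormal Haar wavelet transform and checks that the per-level energies sum to the time-domain energy, as Parseval's theorem implies for orthonormal bases. The Haar basis, the three decomposition levels, the signal length, and the helper names `haar_dwt_multilevel` and `level_energies` are illustrative assumptions; the abstract does not specify the wavelet basis, the number of levels, or the exact form of the constraint used in WaveletDiff.

```python
import numpy as np


def haar_dwt_multilevel(x, levels):
    """Orthonormal Haar DWT (illustrative helper, not the paper's code).

    Returns [approx_L, detail_L, ..., detail_1], coarsest level first.
    """
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        if approx.size % 2 != 0:
            raise ValueError("signal length must be divisible by 2**levels")
        even, odd = approx[0::2], approx[1::2]
        # The 1/sqrt(2) scaling makes the transform orthonormal,
        # which is what allows Parseval's theorem to hold exactly.
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return [approx] + details[::-1]


def level_energies(coeffs):
    """Sum of squared coefficients at each decomposition level."""
    return np.array([np.sum(c ** 2) for c in coeffs])


rng = np.random.default_rng(0)
signal = rng.standard_normal(64)        # assumed toy signal of length 64

coeffs = haar_dwt_multilevel(signal, levels=3)
energies = level_energies(coeffs)

time_energy = np.sum(signal ** 2)
energy_gap = abs(energies.sum() - time_energy)

print("coefficients per level:", [c.size for c in coeffs])
print("per-level energies:", energies)
print("energy gap (wavelet vs. time domain):", energy_gap)
```

In a generative setting, one could compare such per-level energies between generated and reference coefficients and penalize the mismatch; whether WaveletDiff uses exactly this form is not stated in the abstract.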
