Time Series Generation Under Data Scarcity: A Unified Generative Modeling Approach
Tal Gonen, Itai Pemper, Ilan Naiman, Nimrod Berman, Omri Azencot
TL;DR
This work tackles the data scarcity challenge in time-series generation by proposing a unified diffusion-based framework trained on a large, heterogeneous corpus. It introduces dynamic channel adaptation (DyConv) and dataset token conditioning so that a single model, pre-trained across diverse domains, can be fine-tuned on new tasks with minimal data for cross-domain, few-shot generation. Extensive benchmarks across 12 datasets demonstrate substantial gains over state-of-the-art baselines in few-shot and long-horizon settings, and ablations confirm the critical roles of DyConv and domain conditioning. The findings advocate for unified pre-training in time-series generative modeling and provide a benchmark and analysis to guide future research toward scalable, cross-domain diffusion-based generators.
Abstract
Generative modeling of time series is a central challenge in time series analysis, particularly under data-scarce conditions. Despite recent advances in generative modeling, a comprehensive understanding of how state-of-the-art generative models perform under limited supervision remains lacking. In this work, we conduct the first large-scale study evaluating leading generative models in data-scarce settings, revealing a substantial performance gap between full-data and data-scarce regimes. To close this gap, we propose a unified diffusion-based generative framework that can synthesize high-fidelity time series across diverse domains using just a few examples. Our model is pre-trained on a large, heterogeneous collection of time series datasets, enabling it to learn generalizable temporal representations. It further incorporates architectural innovations such as dynamic convolutional layers for flexible channel adaptation and dataset token conditioning for domain-aware generation. Without requiring abundant supervision, our unified model achieves state-of-the-art performance in few-shot settings, outperforming domain-specific baselines across a wide range of subset sizes. Remarkably, it also surpasses all baselines when evaluated on full-dataset benchmarks, highlighting the strength of pre-training and cross-domain generalization. We hope this work encourages the community to revisit few-shot generative modeling as a key problem in time series research and pursue unified solutions that scale efficiently across domains. Code is available at https://github.com/azencot-group/ImagenFew.
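To make the two architectural ideas named above more concrete, the following is a minimal illustrative sketch, not the released implementation. The abstract does not specify how channel adaptation or token conditioning is realized, so everything here is an assumption: the names `resize_channels` and `DynamicPointwiseConv` are hypothetical, the layer is a simple pointwise (kernel size 1) channel mixer, the weights are adapted by linear interpolation along the input-channel axis, and the dataset token is a per-dataset vector added to the output.

```python
import random


def resize_channels(weight, in_channels):
    """Linearly interpolate each row of `weight` (out x canonical_in)
    along the input-channel axis so it has `in_channels` columns.
    Assumption: interpolation is one plausible adaptation rule."""
    canonical = len(weight[0])
    resized = []
    for row in weight:
        new_row = []
        for j in range(in_channels):
            # Map target index j onto the canonical axis, endpoints aligned.
            pos = 0.0 if in_channels == 1 else j * (canonical - 1) / (in_channels - 1)
            lo = int(pos)
            hi = min(lo + 1, canonical - 1)
            frac = pos - lo
            new_row.append((1 - frac) * row[lo] + frac * row[hi])
        resized.append(new_row)
    return resized


class DynamicPointwiseConv:
    """Hypothetical layer: one shared weight bank serves inputs with any
    number of channels, and a per-dataset token shifts the output."""

    def __init__(self, canonical_in, out_channels, num_datasets, seed=0):
        rng = random.Random(seed)
        self.weight = [[rng.gauss(0, 0.1) for _ in range(canonical_in)]
                       for _ in range(out_channels)]
        # One token (vector of size out_channels) per dataset id.
        self.dataset_tokens = [[rng.gauss(0, 0.1) for _ in range(out_channels)]
                               for _ in range(num_datasets)]

    def __call__(self, series, dataset_id):
        # series: list of time steps, each a list of channel values.
        in_channels = len(series[0])
        w = resize_channels(self.weight, in_channels)
        token = self.dataset_tokens[dataset_id]
        return [
            [sum(w_o[c] * step[c] for c in range(in_channels)) + token[o]
             for o, w_o in enumerate(w)]
            for step in series
        ]


layer = DynamicPointwiseConv(canonical_in=8, out_channels=4, num_datasets=3)
univariate = [[0.1], [0.2], [0.3]]                  # 3 steps, 1 channel
multivariate = [[0.1, 0.5, -0.2], [0.0, 0.4, 0.3]]  # 2 steps, 3 channels
out_a = layer(univariate, dataset_id=0)
out_b = layer(multivariate, dataset_id=2)
```

In this sketch the same layer accepts a univariate and a three-channel series and maps both to four output channels, which is the property a single cross-domain model needs; how the actual model achieves it should be checked against the released code.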
