A Diffusion Model for Regular Time Series Generation from Irregular Data with Completion and Masking
Gal Fadlon, Idan Arbiv, Nimrod Berman, Omri Azencot
TL;DR
This work tackles the generation of regular time series from irregular observations by combining Time Series Transformer (TST)-based completion with a vision-based diffusion model that uses masking. The completion step creates natural neighborhoods before diffusion, avoiding the unnatural neighborhoods that arise when missing values are replaced by zeros in naive masking, and enabling robust generation of long sequences. Across 12 datasets, including ultra-long sequences, the method achieves substantial gains in discriminative and predictive metrics while sharply reducing training time, which points to practical potential in healthcare, finance, and science applications. The work also includes ablations and robustness analyses that support integrating completion with masking for irregular time series generation.
Abstract
Generating realistic time series data is critical for applications in healthcare, finance, and science. However, irregular sampling and missing values present significant challenges. While prior methods address these irregularities, they often yield suboptimal results and incur high computational costs. Recent advances in regular time series generation, such as the diffusion-based ImagenTime model, demonstrate strong, fast, and scalable generative capabilities by transforming time series into image representations, making them a promising solution. However, extending ImagenTime to irregular sequences using simple masking introduces "unnatural" neighborhoods, where missing values replaced by zeros disrupt the learning process. To overcome this, we propose a novel two-step framework: first, a Time Series Transformer completes irregular sequences, creating natural neighborhoods; second, a vision-based diffusion model with masking minimizes dependence on the completed values. This approach leverages the strengths of both completion and masking, enabling robust and efficient generation of realistic time series. Our method achieves state-of-the-art performance, with a relative improvement of $70\%$ in discriminative score and a relative reduction of $85\%$ in computational cost. Code is at https://github.com/azencot-group/ImagenI2R.
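The contrast between naive zero masking and the two-step idea can be illustrated with a toy NumPy sketch. This is not the authors' implementation: linear interpolation stands in for the Time Series Transformer, the diffusion model is omitted entirely, and the function names (`zero_fill`, `interpolate_fill`, `masked_mse`) are hypothetical. It only shows that zero filling creates large artificial jumps between neighboring values, while a completed series stays smooth and a masked loss ignores the completed entries.

```python
import numpy as np

def zero_fill(x, mask):
    """Naive masking: entries with mask == 0 are replaced by zeros."""
    return np.where(mask.astype(bool), x, 0.0)

def interpolate_fill(x, mask):
    """Placeholder for the completion step (assumption: the paper uses a
    Time Series Transformer; linear interpolation is used here only for
    illustration)."""
    t = np.arange(len(x))
    obs = mask.astype(bool)
    return np.interp(t, t[obs], x[obs])

def masked_mse(pred, target, mask):
    """Mean squared error computed over observed entries only, so the
    loss does not depend on values at missing positions."""
    obs = mask.astype(bool)
    sq_err = (pred - target) ** 2
    return float(np.where(obs, sq_err, 0.0).sum() / max(obs.sum(), 1))

# Toy regular signal, offset so that zero lies outside its value range.
t = np.linspace(0.0, 2.0 * np.pi, 24)
x = np.sin(t) + 2.0

# Hand-picked missing positions (mask == 0) to mimic irregular sampling.
mask = np.ones(24, dtype=int)
mask[[3, 4, 9, 15, 16, 17]] = 0
x_irregular = np.where(mask.astype(bool), x, np.nan)

zero_filled = zero_fill(x_irregular, mask)
completed = interpolate_fill(x_irregular, mask)

# Largest jump between adjacent time steps in each version of the input.
print("max jump, zero-filled:", np.abs(np.diff(zero_filled)).max())
print("max jump, completed:  ", np.abs(np.diff(completed)).max())
print("masked MSE of completed input:", masked_mse(completed, x_irregular, mask))
```

In this sketch the masked loss is evaluated only where `mask == 1`, which mirrors the stated goal of minimizing dependence on the completed values; how the actual model applies the mask inside the diffusion objective is not specified in the abstract.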
