A Diffusion Model for Regular Time Series Generation from Irregular Data with Completion and Masking

Gal Fadlon, Idan Arbiv, Nimrod Berman, Omri Azencot

TL;DR

This work tackles generating regular time series from irregular observations by combining a Time Series Transformer (TST) based completion with a vision-based diffusion model that uses masking. The two-step approach creates natural neighborhoods before diffusion, mitigating the unnatural neighborhoods caused by zeros in naive masking and enabling robust generation of long sequences. Across 12 datasets and ultra-long sequences, the method achieves substantial gains in discriminative and predictive metrics while dramatically reducing training time, demonstrating strong practical potential for healthcare, finance, and science applications. The framework also includes thorough ablations and robustness analyses, underscoring the value of integrating completion with masking for irregular time-series generation.

Abstract

Generating realistic time series data is critical for applications in healthcare, finance, and science. However, irregular sampling and missing values present significant challenges. While prior methods address these irregularities, they often yield suboptimal results and incur high computational costs. Recent advances in regular time series generation, such as the diffusion-based ImagenTime model, demonstrate strong, fast, and scalable generative capabilities by transforming time series into image representations, making them a promising solution. However, extending ImagenTime to irregular sequences using simple masking introduces "unnatural" neighborhoods, where missing values replaced by zeros disrupt the learning process. To overcome this, we propose a novel two-step framework: first, a Time Series Transformer completes irregular sequences, creating natural neighborhoods; second, a vision-based diffusion model with masking minimizes dependence on the completed values. This approach leverages the strengths of both completion and masking, enabling robust and efficient generation of realistic time series. Our method achieves state-of-the-art performance, with relative improvements of $70\%$ in discriminative score and $85\%$ in computational cost. Code is at https://github.com/azencot-group/ImagenI2R.
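
The contrast between zero-filled and completed inputs, and the role of masking in the loss, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Linear interpolation stands in for the Time Series Transformer completion, no diffusion model is trained, and the names `zero_fill`, `interpolate_fill`, and `masked_mse`, as well as the missing pattern, are hypothetical choices made for illustration only.

```python
import numpy as np


def zero_fill(x, observed):
    """Naive masking: replace missing entries with zeros."""
    return np.where(observed, x, 0.0)


def interpolate_fill(x, observed):
    """Stand-in completion: linear interpolation over time.

    Assumption: this replaces the TST-based completion described in the
    abstract purely to keep the sketch small.
    """
    t = np.arange(x.shape[0])
    return np.interp(t, t[observed], x[observed])


def masked_mse(pred, target, observed):
    """Mean squared error computed over observed entries only."""
    sq_err = (pred - target) ** 2
    return float(sq_err[observed].mean())


# A smooth series that stays far from zero, so zero-filling is clearly unnatural.
t = np.linspace(0.0, 2.0 * np.pi, 24)
x = 5.0 + np.sin(t)

# Hypothetical missing pattern (False marks a missing time step).
observed = np.ones(24, dtype=bool)
observed[[3, 4, 10, 15, 16, 17]] = False

x_zero = zero_fill(x, observed)
x_comp = interpolate_fill(x, observed)

# Largest jump between neighboring time steps under each filling strategy.
jump_zero = np.abs(np.diff(x_zero)).max()
jump_comp = np.abs(np.diff(x_comp)).max()
print("max neighbor jump, zero fill: ", jump_zero)
print("max neighbor jump, completion:", jump_comp)

# Perturb only the completed (unobserved) entries: the masked loss ignores them.
pred = x_comp.copy()
pred[~observed] += 100.0
print("masked MSE after perturbing completed entries:", masked_mse(pred, x_comp, observed))
```

In this toy setting, zero filling places values near 5 next to zeros, while the completed series changes smoothly between neighbors. Restricting the loss to observed entries means errors in the completed values do not contribute to the training signal, which mirrors the stated goal of minimizing dependence on the completion.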

Paper Structure

This paper contains 59 sections, 6 equations, 9 figures, 20 tables.

Figures (9)

  • Figure 1: A data point (A) is mapped to an image with zeros and the coordinates in the center (B). Denoising the entire image yields inferior kernels (D) in comparison to masking (E). Constructing natural neighborhoods (C) yields consistent kernels and better scores (F).
  • Figure 2: In the first step (top), we train a TST-based autoencoder, which we use during the second step (middle), where a vision diffusion model is trained with masking over non-active pixels. Inference (bottom) is done similarly to ImagenTime.
  • Figure 3: Discriminative score vs. training time for our approach and KoVAE across different lengths ($24$, $96$, and $768$). Lower discriminative scores and shorter training times are better.
  • Figure 4: 2D t-SNE embeddings (top) and probability density functions (bottom) for real data vs. synthetic data from our method and KoVAE, under a 70% missing rate. From left to right: Energy (length 24), Weather (length 96), and Stock (length 768) datasets.
  • Figure 5: Inference time per sample (in seconds) vs. sequence length for our model and KoVAE.
  • ...and 4 more figures