
TSGM: Regular and Irregular Time-series Generation using Score-based Generative Models

Haksoo Lim, Jaehoon Lee, Sewon Park, Minjung Kim, Noseong Park

TL;DR

This work adapts score-based diffusion models to time-series generation by learning a conditional score function with an autoregressive denoising objective. The framework, called TSGM, couples an encoder, a decoder, and a conditional score network trained in a latent space, enabling both regular and irregular time-series synthesis. A theoretical result links autoregressive denoising score matching to conventional conditional score learning, and empirical results on four real-world datasets show state-of-the-art discriminative and predictive performance, with strong evidence of generation diversity. While slower than some baselines due to diffusion-based sampling, TSGM demonstrates robust performance across missingness patterns, illustrating practical impact for robust time-series synthesis in realistic settings.

Abstract

Score-based generative models (SGMs) have demonstrated unparalleled sampling quality and diversity in numerous fields, such as image generation, voice synthesis, and tabular data synthesis. Inspired by those outstanding results, we apply SGMs to synthesize time-series by learning the conditional score function. To this end, we present a conditional score network for time-series synthesis, deriving a denoising score matching loss tailored for our purposes. In particular, our presented denoising score matching loss is the conditional denoising score matching loss for time-series synthesis. In addition, our framework is flexible enough that both regular and irregular time-series can be synthesized with minimal changes to our model design. Finally, we obtain exceptional synthesis performance on various time-series datasets, achieving state-of-the-art sampling diversity and quality.

Paper Structure

This paper contains 37 sections, 3 theorems, 26 equations, 4 figures, 14 tables.

Key Result

Theorem 1

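$l_{1}(n,s)$ can be replaced with $l_{2}(n,s)$, whose regression target is $\nabla\log p({\textbf{x}}_{1:n}^s|\textbf{x}_{1:n}^0)$ rather than $\nabla\log p({\textbf{x}}_{1:n}^s|\textbf{x}_{1:n-1}^0)$. Then, $L_{1}=L_{score}$ is satisfied.∎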
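The idea behind results of this kind can be checked numerically in a much simpler setting. The sketch below is not the paper's autoregressive theorem; it is a one-dimensional, unconditional analogue, assuming standard-normal data and a hypothetical linear score model `w * x`. It fits `w` by least squares against the tractable denoising target and compares it with the closed-form marginal score slope. The function name, the values of `a` and `sigma`, and the sample count are illustrative assumptions.

```python
import random


def fit_linear_score(a, sigma, num_samples=200_000, seed=0):
    """Least-squares slope w of a linear score model w * x_s, trained on the
    denoising target -(x_s - a * x_0) / sigma**2 (hypothetical toy setup)."""
    rng = random.Random(seed)
    sum_xt = 0.0  # accumulates x_s * target
    sum_xx = 0.0  # accumulates x_s * x_s
    for _ in range(num_samples):
        x0 = rng.gauss(0.0, 1.0)        # clean sample, x_0 ~ N(0, 1)
        eps = rng.gauss(0.0, 1.0)       # perturbation noise
        xs = a * x0 + sigma * eps       # noisy sample from p(x_s | x_0)
        target = -(xs - a * x0) / sigma ** 2  # score of the Gaussian kernel
        sum_xt += xs * target
        sum_xx += xs * xs
    return sum_xt / sum_xx


a, sigma = 0.8, 0.6
w_fit = fit_linear_score(a, sigma)
# The marginal of x_s is N(0, a^2 + sigma^2), so its score is -x / (a^2 + sigma^2).
w_true = -1.0 / (a ** 2 + sigma ** 2)
print(f"fitted slope: {w_fit:.4f}, marginal score slope: {w_true:.4f}")
```

Regressing on the conditional target recovers the slope of the marginal score up to sampling error, which is the property that makes a denoising objective usable when the desired score itself is intractable.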

Figures (4)

  • Figure 1: The KDE plots show the estimated distributions of the original data and of data generated by several methods on the Air and AI4I datasets (time stamps are ignored when drawing these distributions). Unlike the baseline methods, TSGM-VP produces a distribution almost identical to the original one. These figures provide evidence of the excellent generation quality and diversity of our method. Similar results are observed for TSGM-subVP.
  • Figure 2: The overall workflow of TSGM (see Section \ref{sec:train}). Our original learning objective is to approximate $\nabla\log p({\textbf{x}}_{1:n}^s|\textbf{x}_{1:n-1}^0)$, which is computationally prohibitive, with the conditional score network $M_{\theta}(s,{\textbf{x}}_{1:n}^s,\textbf{x}_{1:n-1}^0)$ using an MSE loss. We then prove in Theorem 1 that learning $\nabla\log p({\textbf{x}}_{1:n}^s|\textbf{x}_{1:n}^0)$ is equivalent to learning $\nabla\log p({\textbf{x}}_{1:n}^s|\textbf{x}_{1:n-1}^0)$ with respect to the parameters $\theta$ of $M_{\theta}$ in the MSE loss, i.e., the optimal $\theta$ is identical for both objectives. In the end, our score network $M_{\theta}(s,\textbf{h}_{n}^s,\textbf{h}_{n-1}^0)$ learns $\nabla\log p({\textbf{h}}_{n}^s|\textbf{h}_{n}^0)$, since RNNs can encode $\textbf{x}_{1:n}^0$ and $\textbf{x}_{1:n-1}^0$ into their hidden states $\textbf{h}_{n}^0$ and $\textbf{h}_{n-1}^0$, respectively.
  • Figure 3: t-SNE plots for TSGM (1st and 2nd columns), TimeGAN (3rd column), TimeVAE (4th column), and GT-GAN (5th column) in regular time-series generation. Red and blue dots denote original and synthesized samples, respectively.
  • Figure 4: Graphical representation of TimeGrad (left) and TSGM (right). We adapt TimeGrad to our generation task, but its results are not comparable even to those of the other baselines (see Appendix \ref{appen:adapt}).
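The workflow in the Figure 2 caption can be sketched as a per-step denoising loss in latent space. This is a minimal illustration under stated assumptions, not the authors' implementation: the latents are plain Python lists standing in for encoder hidden states, `score_fn` is a hypothetical stand-in for $M_{\theta}$, the first step is conditioned on a zero vector, the time weighting of the loss is omitted, and the linear VP noise schedule with `BETA_MIN` and `BETA_MAX` uses commonly seen values rather than values taken from the paper.

```python
import math
import random

BETA_MIN, BETA_MAX = 0.1, 20.0  # assumed schedule values, not from the paper


def vp_kernel(s):
    """Mean coefficient and std of a VP perturbation kernel at time s in (0, 1],
    for a linear schedule beta(s) = BETA_MIN + s * (BETA_MAX - BETA_MIN)."""
    integral = BETA_MIN * s + 0.5 * (BETA_MAX - BETA_MIN) * s * s
    mean_coef = math.exp(-0.5 * integral)
    std = math.sqrt(1.0 - math.exp(-integral))
    return mean_coef, std


def autoregressive_dsm_loss(latents, score_fn, s, rng):
    """Average over steps n of || score_fn(s, h_n^s, h_{n-1}^0) - target ||^2,
    where target is the score of the Gaussian kernel p(h_n^s | h_n^0)."""
    dim = len(latents[0])
    mean_coef, std = vp_kernel(s)
    prev = [0.0] * dim  # assumption: zero vector as the condition for n = 1
    total = 0.0
    for h0 in latents:
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        hs = [mean_coef * h + std * e for h, e in zip(h0, eps)]  # noisy latent
        target = [-e / std for e in eps]  # equals -(hs - mean_coef * h0) / std^2
        pred = score_fn(s, hs, prev)
        total += sum((p - t) ** 2 for p, t in zip(pred, target))
        prev = h0  # the next step is conditioned on the clean latent
    return total / len(latents)


def zero_score(s, h_noisy, h_prev):
    """Hypothetical untrained score model that always predicts zero."""
    return [0.0] * len(h_noisy)


rng = random.Random(0)
toy_latents = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(24)]
loss = autoregressive_dsm_loss(toy_latents, zero_score, s=0.5, rng=rng)
print(f"loss of the zero predictor at s=0.5: {loss:.3f}")
```

In a real training loop, `score_fn` would be a trainable network and `s` would be sampled afresh for each batch; the sketch only shows how the noisy latent, the clean previous latent, and the tractable target fit together.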

Theorems & Definitions (4)

  • Theorem 1: Autoregressive denoising score matching
  • Lemma 2
  • proof
  • Theorem 1: Autoregressive denoising score matching