
TransFusion: Generating Long, High Fidelity Time Series using Diffusion Models with Transformers

Md Fahim Sikder, Resmi Ramachandranpillai, Fredrik Heintz

TL;DR

TransFusion addresses the challenge of generating high-fidelity, long-sequence time-series by coupling a diffusion-based generative process with a Transformer encoder in the denoising network. The authors introduce two transformer-based evaluation metrics, Long-Sequence Discriminative Score (LDS) and Long-Sequence Predictive Score (LPS), along with standard metrics like Jensen-Shannon Divergence and coverage to assess fidelity, diversity, and predictive characteristics. Empirical results on four datasets show that TransFusion outperforms eight baselines across sequence lengths of $N=100$ and $N=384$, with ablations confirming that the diffusion+Transformer pairing is essential. The work demonstrates practical impact by enabling reliable long-sequence generation and providing robust evaluation tools, potentially benefiting synthetic data applications in finance, energy, and environmental domains.
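As an illustration of one of the standard metrics named above, the following is a minimal sketch of the Jensen-Shannon Divergence between two discrete distributions, for example histograms of real and synthetic values. The helper names `histogram` and `jensen_shannon_divergence`, the binning scheme, and the base-2 logarithm are assumptions made for this example; the summary does not state how the paper bins or aggregates values.

```python
import math


def histogram(values, bins, lo, hi):
    """Count values into `bins` equal-width bins on [lo, hi] (hypothetical helper)."""
    if bins < 1 or hi <= lo:
        raise ValueError("need bins >= 1 and hi > lo")
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp out-of-range values into the first or last bin.
        idx = int((v - lo) / width)
        idx = max(0, min(idx, bins - 1))
        counts[idx] += 1
    return counts


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon Divergence (base-2 logs) between two non-negative weight lists."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    p_sum, q_sum = sum(p), sum(q)
    if p_sum <= 0 or q_sum <= 0:
        raise ValueError("each distribution needs positive total mass")
    # Normalize counts to probabilities.
    p = [x / p_sum for x in p]
    q = [x / q_sum for x in q]
    # Mixture distribution M = (P + Q) / 2.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a == 0 contribute nothing; b > 0 whenever a > 0 because b is the mixture.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Example usage with made-up values (not data from the paper).
real_values = [0.1, 0.2, 0.2, 0.4, 0.8, 0.9]
synthetic_values = [0.1, 0.3, 0.3, 0.5, 0.7, 0.9]
score = jensen_shannon_divergence(
    histogram(real_values, bins=5, lo=0.0, hi=1.0),
    histogram(synthetic_values, bins=5, lo=0.0, hi=1.0),
)
```

With base-2 logarithms the score lies between 0 (identical distributions) and 1 (distributions with no overlap), so lower values indicate synthetic data that is closer to the real data.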

Abstract

The generation of high-quality, long-sequence time-series data is essential due to its wide range of applications. In the past, standalone Recurrent and Convolutional Neural Network-based Generative Adversarial Networks (GANs) were used to synthesize time-series data. However, they are inadequate for generating long sequences of time-series data due to limitations in their architecture. Furthermore, GANs are well known for training instability and mode collapse. To address this, we propose TransFusion, a diffusion- and Transformer-based generative model that generates high-quality, long-sequence time-series data. We extend the sequence length to 384 and generate high-quality synthetic data. We also introduce two evaluation metrics to assess the quality of the synthetic data as well as its predictive characteristics. We evaluate TransFusion with a wide variety of visual and empirical metrics, and TransFusion outperforms the previous state-of-the-art by a significant margin.
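To make the diffusion half of the model concrete, here is a minimal sketch of the forward (noising) step in the standard DDPM formulation, $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$. This is the textbook formulation and not necessarily the exact configuration used in TransFusion: the linear schedule, its endpoints, the number of steps, and the function names `linear_beta_schedule`, `cumulative_alphas`, and `q_sample` are all assumptions for illustration. In a model of this kind, a denoising network (here, one built on a Transformer encoder) would be trained to undo this corruption; that network is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed schedule; endpoints are illustrative)."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_s) for s = 0..t, usually written alpha_bar_t."""
    alpha_bars = []
    product = 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Noise a sequence x0 (list of time steps, each a list of features) to step t.

    Returns the noised sequence and the Gaussian noise that was added.
    """
    a = alpha_bars[t]
    signal_scale = math.sqrt(a)
    noise_scale = math.sqrt(1.0 - a)
    noise = [[rng.gauss(0.0, 1.0) for _ in row] for row in x0]
    xt = [
        [signal_scale * v + noise_scale * e for v, e in zip(row, noise_row)]
        for row, noise_row in zip(x0, noise)
    ]
    return xt, noise


# Example usage: a toy sequence of length 384 with 2 features (made-up values).
rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
x0 = [[math.sin(0.05 * i), math.cos(0.05 * i)] for i in range(384)]
xt, noise = q_sample(x0, t=500, alpha_bars=alpha_bars, rng=rng)
```

Because `alpha_bars` decreases with `t`, later steps keep less of the original signal and more noise; generation runs the learned reverse process from pure noise back toward a clean sequence.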

Paper Structure (26 sections, 7 equations, 3 figures, 4 tables, 2 algorithms)

Figures (3)

  • Figure 1: TransFusion architecture: the Transformer block used in TransFusion (left) and the diffusion process (right)
  • Figure 2: Comparison of original data with samples generated by CoTGAN and TransFusion (sequence length: 384, Energy dataset)
  • Figure 3: PCA and t-SNE plots of TimeGAN, QuantGAN, CoTGAN, WaveGAN, Sig-WGAN, TimeGrad, GT-GAN, and TransFusion (ours) on the Energy dataset (dimension = 28, sequence length: 100). Each dot represents one time-series sequence; blue dots are real data and orange dots are synthetic data. If the generative model approximates the real data distribution well, the blue and orange dots should overlap.