DiffStyleTS: Diffusion Model for Style Transfer in Time Series

Mayank Nagda, Phil Ostheimer, Justus Arweiler, Indra Jungjohann, Jennifer Werner, Dennis Wagner, Aparna Muraleedharan, Pouya Jafari, Jochen Schmid, Fabian Jirasek, Jakob Burger, Michael Bortz, Hans Hasse, Stephan Mandt, Marius Kloft, Sophie Fellenz

TL;DR

DiffTSST tackles the challenge of time-series style transfer under data scarcity by introducing a diffusion-based framework that disentangles content and style with dedicated encoders and fuses them via a self-supervised diffusion transformer. The method supports conditional generation from separate content and style signals, enabling diverse, realistic stylized sequences without paired data. Comprehensive experiments show strong style integration, competitive realism, and substantial downstream gains in anomaly detection when used for data augmentation, including zero-shot transfers and length-extrapolating capabilities via ALiBi. The results highlight diffusion models as a powerful foundation for structured, controllable time-series generation with practical impact in domains like chemistry and energy forecasting.

Abstract

Style transfer combines the content of one signal with the style of another. It supports applications such as data augmentation and scenario simulation, helping machine learning models generalize in data-scarce domains. While well developed in vision and language, style transfer methods for time series data remain limited. We introduce DiffTSST, a diffusion-based framework that disentangles a time series into content and style representations via convolutional encoders and recombines them through a self-supervised attention-based diffusion process. At inference, encoders extract content and style from two distinct series, enabling conditional generation of novel samples to achieve style transfer. We demonstrate both qualitatively and quantitatively that DiffTSST achieves effective style transfer. We further validate its real-world utility by showing that data augmentation with DiffTSST improves anomaly detection in data-scarce regimes.

Paper Structure

This paper contains 64 sections, 18 equations, 6 figures, 9 tables, 2 algorithms.

Figures (6)

  • Figure 1: Overview of DiffTSST. The input series $x_{0}$ is gradually corrupted into $x_{T}$ through an iterative forward diffusion process. Content and style features are extracted by dedicated encoders $\mathcal{E}_{\phi}$ and $\mathcal{E}_{\psi}$. In the reverse diffusion process, a denoising network $\epsilon_{\theta}$ conditioned on these representations progressively reconstructs the input signal. At inference, content and style are drawn from distinct series, enabling the synthesis of new sequences via style transfer.
  • Figure 2: Content and style encoders. Left: The content encoder captures the low-frequency content by strongly downsampling the input, processing it at reduced resolution, and then upsampling to the original length, preserving global trends while discarding fine details. Right: The style encoder captures the high-frequency style using small convolutional filters constrained to avoid constant offsets and phase shifts, ensuring sensitivity to local fluctuations. Together, the two encoders disentangle global structure from local variations in the time series.
  • Figure 3: Architecture of the proposed diffusion transformer for time series style transfer. A noisy input $x_t$ is patchified and embedded, combined with a timestep embedding, and processed by a stack of SC-DiT blocks. Each block integrates information from both content and style encoders via cross-attention, while self-attention preserves temporal structure. The network predicts the residual noise $\hat{\epsilon}$ to guide reverse diffusion.
  • Figure 4: Qualitative comparison of style transfer methods across four content–style pairs (Control, Finance, Motion, Devices). Rows show the content, style, and outputs from DiffTSST (ours), wavelet, stitching, and neural optimization baselines. Only DiffTSST effectively integrates style dynamics into the generated series while preserving the underlying content, whereas baseline methods tend to either overfit to content or introduce artifacts.
  • Figure 5: PCA of embeddings of style-transferred series at different temperatures. Higher temperatures lead to greater dispersion along the PCA axes, indicating increased diversity in the generated outputs while remaining within the manifold of realistic series.
  • ...and 1 more figure
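
The content/style split described in the Figure 2 caption can be illustrated with a small NumPy sketch. This is not the paper's learned encoders $\mathcal{E}_{\phi}$ and $\mathcal{E}_{\psi}$; it is a fixed, hand-written stand-in under stated assumptions. The function names `content_features` and `style_features`, the pooling factor, and the kernel `(-1, 2, -1)` are all hypothetical choices made for illustration. The "no constant offsets" constraint is read here as filter taps that sum to zero, and the "no phase shifts" constraint as a symmetric kernel; both readings are assumptions.

```python
import numpy as np


def content_features(x, factor=8):
    """Low-frequency view: average-pool by `factor`, then upsample to the original length.

    Hypothetical stand-in for a content encoder that strongly downsamples,
    works at reduced resolution, and upsamples again.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    pad = (-n) % factor                      # pad so the length divides evenly
    xp = np.pad(x, (0, pad), mode="edge")
    coarse = xp.reshape(-1, factor).mean(axis=1)
    # Place each coarse sample at the center of its pooling window,
    # then linearly interpolate back onto the original time grid.
    centers = np.arange(len(coarse)) * factor + (factor - 1) / 2
    return np.interp(np.arange(n), centers, coarse)


def style_features(x, kernel=(-1.0, 2.0, -1.0)):
    """High-frequency view: small symmetric filter whose taps sum to zero.

    Zero-sum taps make the output insensitive to constant offsets
    (assumed interpretation of the constraint in the caption).
    """
    x = np.asarray(x, dtype=float)
    k = np.asarray(kernel, dtype=float)
    if abs(k.sum()) > 1e-12:
        raise ValueError("kernel taps must sum to zero")
    return np.convolve(x, k, mode="valid")


# Toy series: a slow trend (content) plus a fast oscillation (style).
t = np.linspace(0.0, 1.0, 200)
x = np.sin(2 * np.pi * t) + 0.1 * np.sin(2 * np.pi * 40 * t)

c = content_features(x)   # same length as x, follows the slow trend
s = style_features(x)     # length len(x) - 2, dominated by the fast oscillation
```

In this sketch, adding a constant to `x` changes `content_features(x)` but leaves `style_features(x)` unchanged, which mirrors the intended division of labor between global structure and local fluctuations.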

Theorems & Definitions (1)

  • Definition 1: Time Series Style Transfer (TSST)