Self-Supervised Learning of Time Series Representation via Diffusion Process and Imputation-Interpolation-Forecasting Mask

Zineb Senane, Lele Cao, Valentin Leonhard Buchner, Yusuke Tashiro, Lei You, Pawel Herman, Mats Nordahl, Ruibo Tu, Vilhelm von Ehrenheim

TL;DR

TSDE is a diffusion-based self-supervised framework for time series representation learning (TSRL). It learns general-purpose multivariate time series (MTS) embeddings by conditioning a reverse diffusion process on a learnable MTS embedding. The embedding function combines dual-orthogonal Transformer encoders with a crossover mechanism, and an Imputation-Interpolation-Forecasting (IIF) mask splits each series into observed and masked parts during training. The resulting representations are evaluated on imputation, interpolation, forecasting, anomaly detection, classification, and clustering. Experiments on real-world datasets show state-of-the-art or competitive performance across these tasks, an ablation study supports the contribution of the crossover architecture and the IIF mask, and an inference-speed comparison supports the method's efficiency. The approach targets downstream analytics that remain usable when data are heavily missing or noisy.

Abstract

Time Series Representation Learning (TSRL) focuses on generating informative representations for various Time Series (TS) modeling tasks. Traditional Self-Supervised Learning (SSL) methods in TSRL fall into four main categories: reconstructive, adversarial, contrastive, and predictive, each with a common challenge of sensitivity to noise and intricate data nuances. Recently, diffusion-based methods have shown advanced generative capabilities. However, they primarily target specific application scenarios like imputation and forecasting, leaving a gap in leveraging diffusion models for generic TSRL. Our work, Time Series Diffusion Embedding (TSDE), bridges this gap as the first diffusion-based SSL TSRL approach. TSDE segments TS data into observed and masked parts using an Imputation-Interpolation-Forecasting (IIF) mask. It applies a trainable embedding function, featuring dual-orthogonal Transformer encoders with a crossover mechanism, to the observed part. We train a reverse diffusion process conditioned on the embeddings, designed to predict noise added to the masked part. Extensive experiments demonstrate TSDE's superiority in imputation, interpolation, forecasting, anomaly detection, classification, and clustering. We also conduct an ablation study, present embedding visualizations, and compare inference speed, further substantiating TSDE's efficiency and validity in learning representations of TS data.
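The observed/masked split described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not spell out the masking rules, so the three strategies below are assumptions inferred from the mask's name: random entries hidden (imputation), one whole time step hidden (interpolation), and everything after a cut point hidden (forecasting). The function names `iif_mask` and `split_observed_masked`, the `missing_ratio` parameter, and the uniform choice among strategies are hypothetical and not taken from the paper. The sketch also ignores values that are already missing in the raw data.

```python
import numpy as np


def iif_mask(shape, rng, strategy=None, missing_ratio=0.1):
    """Build a binary mask of shape (n_features, n_steps).

    1 marks an observed entry (used for conditioning), 0 marks a masked
    entry (the part whose added noise the model would learn to predict).
    All three masking rules are illustrative assumptions.
    Requires n_steps >= 3.
    """
    n_features, n_steps = shape
    if strategy is None:
        # Assumption: pick one of the three strategies uniformly at random.
        strategy = rng.choice(["imputation", "interpolation", "forecasting"])

    mask = np.ones(shape, dtype=np.int8)
    if strategy == "imputation":
        # Hide individual entries independently at random.
        mask[rng.random(shape) < missing_ratio] = 0
    elif strategy == "interpolation":
        # Hide every feature at one interior time step.
        t = rng.integers(1, n_steps - 1)
        mask[:, t] = 0
    elif strategy == "forecasting":
        # Hide every feature from a random cut point to the end.
        cut = rng.integers(1, n_steps)
        mask[:, cut:] = 0
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return mask


def split_observed_masked(x, mask):
    """Split x into its observed part and its masked part (zeros elsewhere)."""
    return x * mask, x * (1 - mask)


rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8))  # toy MTS: 3 features, 8 time steps
mask = iif_mask(x.shape, rng, strategy="forecasting")
x_observed, x_masked = split_observed_masked(x, mask)
```

In this sketch the two parts are complementary, so `x_observed + x_masked` recovers `x`. In a training loop along the lines the abstract describes, only `x_observed` would be passed to the embedding function, and the diffusion noise would be added to `x_masked`.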

Paper Structure (78 sections, 42 equations, 8 figures, 13 tables, 7 algorithms)


Figures (8)

  • Figure 1: The TSDE architecture comprises an embedding function (left) and a conditional reverse diffusion block (right). The temporal and spatial encoders are each implemented as a one-layer Transformer.
  • Figure 2: Comparison of predicted and ground truth values for (a) imputation (10% missing), (b) interpolation, and (c) forecasting. The line is the median of the predictions, and the red shade indicates the 5% to 95% quantile range for missing/future values. See the appendix for more results.
  • Figure 3: Clustering of (a) raw MTS, (b) TSDE embedding of raw MTS, and (c) TSDE embedding of TSDE-imputed MTS. Marker shapes denote ground-truth binary labels; colors indicate DBSCAN (Ester et al., 1996) clusters after UMAP (McInnes et al., 2018) dimension reduction.
  • Figure 4: TSDE embedding visualization of (a) Trend, (b) Seasonal, and (c) Noise components from synthetic MTS.
  • Figure 5: Visualization of an MTS input $\mathbf{x}$, illustrating the formation of binary-valued evaluation masks $\mathbf{m}^{\text{gt}}$ and $\mathbf{m}$.
  • ...and 3 more figures