Self-Supervised Learning for Time Series Analysis: Taxonomy, Progress, and Prospects

Kexin Zhang, Qingsong Wen, Chaoli Zhang, Rongyao Cai, Ming Jin, Yong Liu, James Zhang, Yuxuan Liang, Guansong Pang, Dongjin Song, Shirui Pan

TL;DR

This survey reviews self-supervised learning (SSL) for time series, filling the gap left by the absence of a comprehensive, taxonomy-driven review. It groups methods into three families with ten subcategories: generative-based (autoregressive forecasting, autoencoder reconstruction, diffusion-based generation), contrastive-based (sampling, prediction, augmentation, prototype, and expert knowledge contrast), and adversarial-based (time series generation and imputation, auxiliary representation enhancement). The paper also summarizes applications and datasets for anomaly detection, forecasting, classification, and clustering, and discusses future directions including data augmentation strategies, inductive biases, irregular and sparse data, pre-training of large models, robustness to adversarial attacks, benchmarks, and collaborative systems. By linking methodological choices to downstream tasks and datasets, the review aims to guide practitioners and researchers in selecting SSL strategies for time series analysis, and it underscores SSL's potential to improve data efficiency and generalization in real-world time series tasks.

Abstract

Self-supervised learning (SSL) has recently achieved impressive performance on various time series tasks. Its most prominent advantage is that it reduces the dependence on labeled data: with a pre-training and fine-tuning strategy, a model can achieve high performance using even a small amount of labeled data. Although many self-supervised surveys have been published for computer vision and natural language processing, a comprehensive survey for time series SSL is still missing. To fill this gap, we review current state-of-the-art SSL methods for time series data in this article. We first comprehensively review existing surveys related to SSL and time series, and then provide a new taxonomy of existing time series SSL methods by summarizing them from three perspectives: generative-based, contrastive-based, and adversarial-based. These methods are further divided into ten subcategories, with detailed reviews and discussions of their key intuitions, main frameworks, advantages, and disadvantages. To facilitate experiments and validation of time series SSL methods, we also summarize datasets commonly used in time series forecasting, classification, anomaly detection, and clustering tasks. Finally, we present future directions of SSL for time series analysis.

Paper Structure (66 sections, 31 equations, 6 figures, 8 tables)

Figures (6)

  • Figure 1: The proposed taxonomy of SSL for time series data.
  • Figure 2: Three categories of generative-based SSL for time series data.
  • Figure 3: Five categories of contrastive-based SSL for time series data.
  • Figure 4: Three categories of adversarial-based SSL for time series data.
  • Figure 5: Learning paradigms of SSL.
  • ...and 1 more figure