A Shapelet-based Framework for Unsupervised Multivariate Time Series Representation Learning

Zhiyu Liang, Jianfeng Zhang, Chen Liang, Hongzhi Wang, Zheng Liang, Lujia Pan

TL;DR

This work tackles unsupervised, general-purpose representation learning for multivariate time series by introducing Contrastive Shapelet Learning (CSL). CSL deploys a Shapelet Transformer to encode time series across multiple scales and measures, paired with a multi-grained contrastive objective and a multi-scale alignment loss facilitated by a diverse data augmentation library. Empirical results across 34 real-world datasets show CSL consistently outperforms competing URL methods and competes with fully supervised approaches on several tasks, while also offering interpretable shapelets. The framework is scalable to long time series and provides practical, implementable benefits for downstream tasks such as classification, clustering, and anomaly detection.

Abstract

Recent studies have shown great promise in unsupervised representation learning (URL) for multivariate time series, because URL can learn generalizable representations for many downstream tasks without relying on labels that are often inaccessible. However, existing approaches usually adopt models originally designed for other domains (e.g., computer vision) to encode the time series data and rely on strong assumptions to design learning objectives, which limits their performance. To address these problems, we propose a novel URL framework for multivariate time series that learns time-series-specific shapelet-based representations through the popular contrastive learning paradigm. To the best of our knowledge, this is the first work to explore shapelet-based embeddings in unsupervised general-purpose representation learning. A unified shapelet-based encoder and a novel learning objective with multi-grained contrasting and multi-scale alignment are designed to achieve this goal, and a data augmentation library is employed to improve generalization. We conduct extensive experiments on tens of real-world datasets to assess the representation quality on many downstream tasks, including classification, clustering, and anomaly detection. The results demonstrate the superiority of our method over not only URL competitors but also techniques specially designed for the downstream tasks. Our code is publicly available at https://github.com/real2fish/CSL.
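To make the notion of a shapelet-based representation more concrete, the following is a minimal sketch of a classical shapelet transform, in which each feature is the smallest distance between a shapelet and any equal-length window of the series. It is not the CSL Shapelet Transformer: the function names `shapelet_distance` and `shapelet_transform` are hypothetical, the shapelets here are fixed arrays rather than learned parameters, and only Euclidean distance is used, whereas the abstract describes an encoder with multiple scales and measures.

```python
import numpy as np


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between a shapelet and any window of the series.

    series:   array of shape (D, T), D variables observed over T time steps
    shapelet: array of shape (D, L) with L <= T
    """
    length = shapelet.shape[1]
    # All length-L windows along the time axis: shape (D, T - L + 1, L).
    windows = np.lib.stride_tricks.sliding_window_view(series, length, axis=1)
    # Compare every window with the shapelet across all variables at once.
    diffs = windows - shapelet[:, None, :]
    dists = np.sqrt((diffs ** 2).sum(axis=(0, 2)))
    return dists.min()


def shapelet_transform(series, shapelets):
    """Map one series to a feature vector with one entry per shapelet.

    Shapelets may have different lengths, which is a simple way to
    look at the series at several scales.
    """
    return np.array([shapelet_distance(series, s) for s in shapelets])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(2, 50))          # toy 2-variable series, 50 steps
    candidates = [x[:, 10:15].copy(),     # short shapelet cut from the series
                  rng.normal(size=(2, 20))]  # longer random shapelet
    print(shapelet_transform(x, candidates))
```

In this sketch a small feature value means the pattern occurs somewhere in the series, which is what makes such representations easy to inspect. A learned variant would treat the shapelet arrays as trainable parameters and optimize them with an unsupervised objective.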

Paper Structure (19 sections, 14 equations, 11 figures, 11 tables)

Figures (11)

  • Figure 1: Overview of the CSL framework.
  • Figure 2: Illustration of the data augmentation methods using a two-dimensional time series. All methods are identically performed on each dimension of the original time series.
  • Figure 3: Architecture of Shapelet Transformer (ST)
  • Figure 4: Illustration of multi-grained contrasting and multi-scale alignment. One shapelet is displayed at each scale for clarity.
  • Figure 5: Two-dimensional t-SNE visualization of the unsupervised learned representation for the ERing test set. Classes are distinguishable by their respective marker shapes and colors.
  • ...and 6 more figures

Theorems & Definitions (2)

  • Definition 3.1: Multivariate Time Series
  • Definition 3.2: Unsupervised Representation Learning for MTS