A Systematic Evaluation of Generated Time Series and Their Effects in Self-Supervised Pretraining

Audrey Der, Chin-Chia Michael Yeh, Xin Dai, Huiyuan Chen, Yan Zheng, Yujie Fan, Zhongfang Zhuang, Vivian Lai, Junpeng Wang, Liang Wang, Wei Zhang, Eamonn Keogh

TL;DR

The paper addresses the data scarcity barrier in self-supervised pretraining for time series by replacing real pretraining data with large volumes of generated sequences. It systematically evaluates six generators and four contrastive pretraining methods across two backbone architectures (ResNet and Transformer) on UCR and UEA datasets. Key findings show that generated time series generally improve pretraining performance, especially when the available real data are scarce, and that ResNet-backed models often outperform Transformer-based ones. This approach offers a practical way to scale pretraining for time-series tasks and reduces reliance on labeled data, with potential for further gains via advanced generators and ensemble strategies.

Abstract

Self-supervised Pretrained Models (PTMs) have demonstrated remarkable performance in computer vision and natural language processing tasks. These successes have prompted researchers to design PTMs for time series data. In our experiments, most self-supervised time series PTMs were surpassed by simple supervised models. We hypothesize this undesired phenomenon may be caused by data scarcity. In response, we test six time series generation methods, use the generated data in pretraining in lieu of the real data, and examine the effects on classification performance. Our results indicate that replacing a real-data pretraining set with a greater volume of only generated samples produces noticeable improvement.

Paper Structure (9 sections, 4 equations, 10 figures, 2 tables)

Figures (10)

  • Figure 1: The time series generator can be utilized to synthesize data for self-supervised pretraining.
  • Figure 2: The designs of the residual block (RBlock) and the ResNet are shown in this figure.
  • Figure 3: The designs of the transformer block (TBlock) and the Transformer are shown in this figure.
  • Figure 4: There are two ways to use backbone models for pretraining and finetuning. All pretraining methods but TF-C use Figure 4a. TF-C uses Figure 4b. The projector and classifier are shown in Figures 4c and 4d.
  • Figure 5: Generated random walk signals.
  • ...and 5 more figures
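
To make the idea of pretraining on generated data concrete, the sketch below builds a small set of random walk signals of the kind named in Figure 5. It is an illustration under stated assumptions rather than the authors' generator: the function name `generate_random_walks`, the Gaussian step distribution, the per-series z-normalization, and the set size and series length are all hypothetical choices not taken from the paper.

```python
import random


def generate_random_walks(n_series, length, seed=0):
    """Return n_series z-normalized random walks, each with `length` points.

    Assumptions (not from the paper): unit-variance Gaussian steps and
    per-series z-normalization.
    """
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_series):
        # A random walk is the running sum of independent random steps.
        walk = []
        level = 0.0
        for _ in range(length):
            level += rng.gauss(0.0, 1.0)
            walk.append(level)
        # Z-normalize so every series has zero mean and unit variance.
        mean = sum(walk) / length
        std = (sum((x - mean) ** 2 for x in walk) / length) ** 0.5
        dataset.append([(x - mean) / (std or 1.0) for x in walk])
    return dataset


# Hypothetical sizes: a generated set that would stand in for the real
# pretraining data.
pretrain_set = generate_random_walks(n_series=1000, length=128)
```

Because the generator is cheap to run, the generated set can be made much larger than a typical real training set, which is the setting the abstract describes.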