Distillation Enhanced Time Series Forecasting Network with Momentum Contrastive Learning

Haozhi Gao, Qianqian Ren, Jinbao Li

TL;DR

DE-TSMCL is a distillation-enhanced framework for long-sequence time series forecasting. It adaptively learns whether to mask each timestamp to obtain optimized sub-sequences, and it adds a supervised task that learns more robust representations and facilitates the contrastive learning process.

Abstract

Contrastive representation learning is crucial in time series analysis because it alleviates data noise and incompleteness as well as the sparsity of supervision signals. However, existing contrastive learning frameworks usually focus on intra-temporal features, which fails to fully exploit the intricate nature of time series data. To address this issue, we propose DE-TSMCL, a distillation-enhanced framework for long-sequence time series forecasting. Specifically, we design a learnable data augmentation mechanism that adaptively learns whether to mask a timestamp to obtain optimized sub-sequences. Then, we propose a contrastive learning task with momentum update that explores inter-sample and intra-temporal correlations to learn underlying structural features from unlabeled time series. Meanwhile, we design a supervised task to learn more robust representations and facilitate the contrastive learning process. Finally, we jointly optimize the two tasks. By building the model loss from multiple tasks, we can learn effective representations for the downstream forecasting task. Extensive experiments against state-of-the-art methods demonstrate the effectiveness of DE-TSMCL, with a maximum improvement of 27.3%.
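The three mechanisms named in the abstract (per-timestamp masking, momentum update, joint optimization) can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names, the Bernoulli sampling of the mask from learnable logits, the MoCo-style exponential moving average for the momentum update, and the hypothetical trade-off weight `lam` in the combined loss are not specified in the abstract.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sample_mask(mask_logits, rng):
    # Assumption: one learnable logit per timestamp; 1 keeps it, 0 masks it.
    return [1 if rng.random() < sigmoid(z) else 0 for z in mask_logits]


def apply_mask(series, mask):
    # Masked timestamps are zeroed out to form the augmented sub-sequence.
    return [x * keep for x, keep in zip(series, mask)]


def momentum_update(online_params, momentum_params, m=0.99):
    # Assumption: MoCo-style exponential moving average,
    # new_momentum = m * momentum + (1 - m) * online.
    return [m * k + (1.0 - m) * q
            for q, k in zip(online_params, momentum_params)]


def joint_loss(supervised_loss, contrastive_loss, lam=0.5):
    # Assumption: the two task losses are combined as a weighted sum,
    # with `lam` a hypothetical trade-off weight.
    return supervised_loss + lam * contrastive_loss


if __name__ == "__main__":
    rng = random.Random(0)
    series = [0.2, 0.5, 0.9, 0.4]
    mask = sample_mask([2.0, -2.0, 2.0, 0.0], rng)
    print(apply_mask(series, mask))
    print(momentum_update([1.0, 2.0], [0.0, 0.0], m=0.9))
    print(joint_loss(1.0, 2.0, lam=0.5))
```

The sketch only shows the forward computation. Training the mask logits end to end would additionally need a differentiable relaxation or a gradient estimator, which the abstract does not describe.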

Paper Structure (37 sections, 18 equations, 10 figures, 10 tables, 1 algorithm)

Figures (10)

  • Figure 1: The overall architecture of DE-TSMCL. It consists of four major components: data augmentation, representation learning, supervised task, and self-supervised task.
  • Figure 2: The design of the encoder, which follows a GELU-DilatedConv-GELU-DilatedConv structure.
  • Figure 3: The effect of each component of DE-TSMCL for univariate time series forecasting.
  • Figure 4: The effect of each component of DE-TSMCL for multivariate time series forecasting.
  • Figure 5: The impact of $\lambda$ on four different datasets for univariate time series forecasting.
  • ...and 5 more figures
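
The GELU-DilatedConv-GELU-DilatedConv ordering named in the Figure 2 caption can be sketched as follows. This is a simplified single-channel illustration under stated assumptions: the kernel values, the dilation factor, the zero "same" padding, and the absence of a residual connection are all choices made here for brevity, not details taken from the paper.

```python
import math


def gelu(x):
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def dilated_conv1d(series, kernel, dilation):
    # Single-channel 1D convolution (cross-correlation form) with zero
    # padding so the output has the same length as the input.
    # Assumes an odd kernel length; taps are spaced `dilation` steps apart.
    center = len(kernel) // 2
    out = []
    for t in range(len(series)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + (j - center) * dilation
            if 0 <= idx < len(series):
                acc += w * series[idx]
        out.append(acc)
    return out


def encoder_block(series, kernel_a, kernel_b, dilation):
    # GELU -> DilatedConv -> GELU -> DilatedConv, in the caption's order.
    h = [gelu(v) for v in series]
    h = dilated_conv1d(h, kernel_a, dilation)
    h = [gelu(v) for v in h]
    return dilated_conv1d(h, kernel_b, dilation)


if __name__ == "__main__":
    x = [0.1, -0.3, 0.8, 0.5, -0.2, 0.4]
    print(encoder_block(x, [0.25, 0.5, 0.25], [0.25, 0.5, 0.25], dilation=2))
```

Increasing the dilation widens the receptive field without adding kernel weights, which is the usual reason for using dilated convolutions on long sequences.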