Segment, Shuffle, and Stitch: A Simple Layer for Improving Time-Series Representations

Shivam Grover, Amin Jalali, Ali Etemad

TL;DR

This work proposes a simple plug-and-play neural network layer called Segment, Shuffle, and Stitch (S3) designed to improve representation learning in time-series models, and shows that incorporating S3 results in significant improvements for the tasks of time-series classification, forecasting, and anomaly detection.

Abstract

Existing approaches for learning representations of time-series keep the temporal arrangement of the time-steps intact, on the presumption that the original order is optimal for learning. However, non-adjacent sections of real-world time-series may have strong dependencies. Accordingly, we raise the question: Is there an alternative arrangement for time-series which could enable more effective representation learning? To address this, we propose a simple plug-and-play neural network layer called Segment, Shuffle, and Stitch (S3) designed to improve representation learning in time-series models. S3 works by creating non-overlapping segments from the original sequence and shuffling them in a learned manner that is optimal for the task at hand. It then re-attaches the shuffled segments and performs a learned weighted sum with the original input, so that the output captures both the newly shuffled sequence and the original sequence. S3 is modular and can be stacked to achieve different levels of granularity, and can be added to many forms of neural architectures, including CNNs and Transformers, with negligible computational overhead. Through extensive experiments on several datasets and state-of-the-art baselines, we show that incorporating S3 results in significant improvements for the tasks of time-series classification, forecasting, and anomaly detection, improving performance on certain datasets by up to 68%. We also show that S3 makes the learning more stable, with a smoother training loss curve and loss landscape compared to the original baseline. The code is available at https://github.com/shivam-grover/S3-TimeSeries.
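
The three steps named in the abstract (segment, shuffle, stitch) followed by the weighted sum with the original input can be illustrated with a minimal sketch. This is not the authors' implementation, which is in the linked repository. It is a plain-Python illustration under stated assumptions: the function name `s3_layer`, the per-segment `priorities` scores that decide the segment order, and the two scalar weights `w_shuffled` and `w_original` are hypothetical stand-ins for whatever learned parameters the real layer uses. The sketch also assumes the sequence length is divisible by the number of segments, and it does not show how the parameters would be trained.

```python
def s3_layer(x, n_segments, priorities, w_shuffled, w_original):
    """Illustrative Segment-Shuffle-Stitch pass over a 1-D sequence.

    All parameter names are hypothetical; in a trained model the
    priorities and weights would be learned rather than passed in.
    """
    if len(priorities) != n_segments:
        raise ValueError("need exactly one priority per segment")
    if len(x) % n_segments != 0:
        # Assumption of this sketch only: no padding or resampling.
        raise ValueError("len(x) must be divisible by n_segments")

    seg_len = len(x) // n_segments

    # Segment: cut the sequence into non-overlapping, equal-length pieces.
    segments = [x[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]

    # Shuffle: reorder segments by descending priority score.
    order = sorted(range(n_segments), key=lambda i: priorities[i], reverse=True)

    # Stitch: concatenate the reordered segments back into one sequence.
    stitched = [value for i in order for value in segments[i]]

    # Weighted sum of the shuffled sequence and the original input.
    return [w_shuffled * s + w_original * o for s, o in zip(stitched, x)]


if __name__ == "__main__":
    series = [1, 2, 3, 4, 5, 6]
    # Three segments: [1, 2], [3, 4], [5, 6]; priorities put the middle one first.
    print(s3_layer(series, 3, [0.1, 0.9, 0.5], 0.5, 0.5))
```

With the priorities above, the segment order becomes second, third, first, so the stitched sequence is `[3, 4, 5, 6, 1, 2]` before it is averaged with the original. Setting `w_shuffled` to zero returns the input unchanged, which shows how the weighted sum lets the original ordering be retained when shuffling does not help.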

Paper Structure (12 sections, 6 equations, 14 figures, 15 tables)

Figures (14)

  • Figure 1: Stacking S3 layers. In this depiction, we use $n = 2$, $\phi = 3$, and $\theta = 2$ as the hyperparameters.
  • Figure 2: t-SNE visualizations of the learned representations of TS2Vec and TS2Vec+S3 for 4 randomly chosen test sets. Different colors represent different classes. It can be seen that representations belonging to different classes are more separable after adding S3.
  • Figure 3: Forecasting output by Informer and Informer+S3 for a sample from ETTh1.
  • Figure 4: Training loss against iterations on two sample datasets from the UCR archive.
  • Figure 5: Visualisation of the loss landscape for the Beef dataset from the UCR archive.
  • ...and 9 more figures