
SimPSI: A Simple Strategy to Preserve Spectral Information in Time Series Data Augmentation

Hyun Ryu, Sunjae Yoon, Hee Suk Yoon, Eunseop Yoon, Chang D. Yoo

TL;DR

SimPSI addresses the problem that common time series data augmentations distort spectral information, so an augmentation that helps on one task can fail to help on another. It preserves the spectrum by mixing the original and augmented spectra with a per-frequency preservation map, which comes in three types: the magnitude spectrum, a saliency map, and a learned spectrum-preservative map trained with a preservation-contrastive loss. The classifier is trained with standard cross-entropy, while the preservation map generator is updated separately to encourage preserving informative spectral regions. This yields improved accuracy and AUPRC across the HAR, SleepEDF, and Waveform benchmarks, often outperforming random preservation. Ablation studies confirm the importance of the preservation loss and of the separate training, and analyses show that preservation patterns depend on the data domain, suggesting that SimPSI reduces augmentation-induced spectral bias and is practical across diverse time series tasks.

Abstract

Data augmentation is a crucial component in training neural networks to overcome the limitation imposed by data size, and several techniques have been studied for time series. Although these techniques are effective in certain tasks, they have yet to be generalized to time series benchmarks. We find that current data augmentation techniques ruin the core information contained within the frequency domain. To address this issue, we propose a simple strategy to preserve spectral information (SimPSI) in time series data augmentation. SimPSI preserves the spectral information by mixing the original and augmented input spectrum weighted by a preservation map, which indicates the importance score of each frequency. Specifically, our experimental contributions are to build three distinct preservation maps: magnitude spectrum, saliency map, and spectrum-preservative map. We apply SimPSI to various time series data augmentations and evaluate its effectiveness across a wide range of time series benchmarks. Our experimental results support that SimPSI considerably enhances the performance of time series data augmentations by preserving core spectral information. The source code used in the paper is available at https://github.com/Hyun-Ryu/simpsi.
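
The mixing step described in the abstract can be written compactly. The notation below is an assumption made for illustration and is not taken from the paper: x is the original series, \tilde{x} its augmented version, \mathcal{F} the discrete Fourier transform, P a preservation map with entries in [0, 1], and \odot the element-wise product. The weight on the augmented spectrum is assumed to be the complement 1 - P.

\[
\hat{x} = \mathcal{F}^{-1}\big( P \odot \mathcal{F}(x) + (1 - P) \odot \mathcal{F}(\tilde{x}) \big)
\]

Under this reading, a frequency with P = 1 keeps the original spectrum exactly, while a frequency with P = 0 takes the augmented spectrum unchanged.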

Paper Structure (41 sections, 8 equations, 17 figures, 8 tables, 1 algorithm)


Figures (17)

  • Figure 1: Dependency on data domain of time series data augmentation techniques. The plot shows the increment of classification accuracy of a baseline model after applying each data augmentation technique, which is evaluated on signal demodulation (Simulation), human activity recognition (HAR), and sleep stage detection (SleepEDF) tasks.
  • Figure 2: Visualization of a representative example from the HAR dataset in the time and frequency domain with various time series data augmentation techniques. Each color denotes a channel, and three channels are shown.
  • Figure 3: A SimPSI diagram. The original data is augmented randomly in the time domain. Then, the original and augmented data are both Fourier-transformed. The original spectrum is weighted by its preservation map, while the augmented spectrum is weighted by the negated preservation map, and those two are added. It is inverse-Fourier-transformed, which generates an information-preserved augmented view of the original time series data. We use a single-channel time series for better understanding, in which we visualize the real parts of the time series and magnitudes of spectra and omit channel-wise broadcasting.
  • Figure 4: Finding a set of frequencies to preserve using SimPSI (Spectrum-preservative map) during Frequency masking. The top row shows representative input magnitude spectra from the FSK8 test set. The bottom row shows the corresponding learned preservation map where the ten largest values are marked as diamonds.
  • Figure 5: Testing accuracy of a 3-layer CNN model trained on the HAR dataset using Permutation with and without SimPSI (Spectrum-preservative map) while varying the maximum number of segments.
  • ...and 12 more figures
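
The pipeline in the Figure 3 caption can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation (see the linked repository for that). The function names `simpsi_mix` and `magnitude_preservation_map` are hypothetical, the "negated" map is assumed to mean 1 - P, and the peak normalization of the magnitude spectrum to [0, 1] is an assumption made so that the map can act as a mixing weight.

```python
import numpy as np


def magnitude_preservation_map(x):
    """Hypothetical helper: score each frequency by the magnitude spectrum.

    Assumption: the magnitudes are divided by their peak so that the map
    lies in [0, 1]; the paper may normalize differently.
    """
    mag = np.abs(np.fft.fft(x, axis=-1))
    peak = mag.max(axis=-1, keepdims=True)
    return mag / np.maximum(peak, 1e-12)  # guard against an all-zero input


def simpsi_mix(x, x_aug, pmap):
    """Mix original and augmented spectra, weighted per frequency.

    Assumption: the "negated preservation map" is the complement 1 - pmap.
    The result is complex in general, so callers working with real-valued
    signals can take the real part.
    """
    spec = np.fft.fft(x, axis=-1)          # original spectrum
    spec_aug = np.fft.fft(x_aug, axis=-1)  # augmented spectrum
    mixed = pmap * spec + (1.0 - pmap) * spec_aug
    return np.fft.ifft(mixed, axis=-1)     # back to the time domain


# Usage: a 5-cycle sine wave with additive noise standing in for an augmentation.
rng = np.random.default_rng(0)
t = np.arange(128) / 128
x = np.sin(2 * np.pi * 5 * t)
x_aug = x + 0.5 * rng.standard_normal(x.shape)

pmap = magnitude_preservation_map(x)
x_preserved = simpsi_mix(x, x_aug, pmap).real
```

With a map of all ones the function returns the original series, and with all zeros it returns the plain augmented series, so the map interpolates between the two on a per-frequency basis.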