Dominant Shuffle: A Simple Yet Powerful Data Augmentation for Time-series Prediction
Kai Zhao, Zuojie He, Alex Hung, Dan Zeng
TL;DR
Dominant Shuffle is a simple frequency-domain data augmentation for time-series forecasting that restricts perturbations to the top-$k$ dominant frequencies and shuffles those components instead of injecting external noise. The perturbation is applied to the frequency representation of the concatenated data-label pair, $F(\omega)=\mathcal{F}([x,y])$, and augmented samples are reconstructed via the inverse transform, so the method preserves data-label coherence while generating diverse training examples. Extensive experiments across eight datasets and six baseline models show consistent improvements over the unaugmented baselines and over competing augmentation methods, particularly for long-horizon forecasting. The approach is lightweight and easy to implement, but it remains heuristic, with no formal theoretical justification, and is currently tailored to prediction tasks because perturbing labels is a concern in classification scenarios.
Abstract
Recent studies have suggested that frequency-domain data augmentation (DA) is effective for time series prediction. Existing frequency-domain augmentations disturb the original data with various full-spectrum noises, leading to an excessive domain gap between augmented and original data. Although impressive performance has been achieved in certain cases, frequency-domain DA has yet to be generalized to time series prediction datasets. In this paper, we find that frequency-domain augmentations can be significantly improved by two modifications that limit the perturbations. First, limiting the perturbation to only the dominant frequencies significantly outperforms full-spectrum perturbations. Dominant frequencies represent the main periodicity and trends of the signal and are more important than other frequencies. Second, simply shuffling the dominant frequency components is superior to sophisticatedly designed random perturbations. Shuffling rearranges the original components (magnitudes and phases) and limits the external noise. With these two modifications, we propose dominant shuffle, a simple yet effective data augmentation for time series prediction. Our method can be implemented with just a few lines of code. Extensive experiments with eight datasets and six popular time series models demonstrate that our method consistently improves the baseline performance under various settings and significantly outperforms other DA methods. Code can be accessed at https://kaizhao.net/time-series.
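The procedure described above can be illustrated with a minimal NumPy sketch. It is not the authors' released code; the function name `dominant_shuffle`, the parameter `k`, the choice to leave the DC (zero-frequency) bin untouched, and the choice to rank frequencies by magnitude summed over channels are all assumptions made for illustration.

```python
import numpy as np

def dominant_shuffle(x, y, k=3, rng=None):
    """Illustrative sketch (hypothetical helper, not the official implementation).

    x: look-back window, shape (Lx, C); y: forecast horizon, shape (Ly, C).
    Returns an augmented (x_aug, y_aug) pair with the same shapes.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Concatenate data and label so both are perturbed coherently.
    xy = np.concatenate([x, y], axis=0)            # (Lx + Ly, C)
    spec = np.fft.rfft(xy, axis=0)                 # (n_freq, C), complex

    # Rank frequency bins by magnitude summed over channels (assumption).
    mag = np.abs(spec).sum(axis=1)
    mag[0] = 0.0                                   # assumption: never select the DC bin
    k = max(1, min(k, spec.shape[0] - 1))
    top = np.argsort(mag)[-k:]                     # indices of the k dominant bins

    # Shuffle the dominant components (magnitude and phase move together).
    perm = rng.permutation(top)
    out = spec.copy()
    out[top] = spec[perm]

    # Back to the time domain, then split into data and label again.
    xy_aug = np.fft.irfft(out, n=xy.shape[0], axis=0)
    return xy_aug[: len(x)], xy_aug[len(x):]
```

Because the dominant bins are only permuted, no external noise is introduced: the augmented sequence reuses the original magnitudes and phases at different frequencies. In a training loop, such a function would typically be applied to each sample (or batch) before the forward pass, with `k` treated as a hyperparameter.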
