Wave-Mask/Mix: Exploring Wavelet-Based Augmentations for Time Series Forecasting
Dona Arabi, Jafar Bakhshaliyev, Ayse Coskuner, Kiran Madhusudhanan, Kami Serdar Uckardes
TL;DR
This paper addresses data scarcity in time series forecasting by introducing two wavelet-based augmentations, WaveMask and WaveMix, built on the discrete wavelet transform (DWT) to modify frequency content while preserving temporal structure. The authors integrate these augmentations into a training pipeline with a DLinear backbone and evaluate them against frequency-domain and decomposition-based baselines across four datasets, including a cold-start downsampling study. Results show WaveMask and WaveMix achieve competitive performance across most forecasting horizons, often outperforming baselines and displaying improved stability, especially in low-data regimes. The work offers a simple, effective augmentation strategy for multivariate TSF with potential to extend to other backbones and domains, enhancing robustness under data scarcity.
Abstract
Data augmentation is important for improving machine learning model performance when real-world data is limited. In time series forecasting (TSF), where accurate predictions are crucial in fields like finance, healthcare, and manufacturing, traditional augmentation methods designed for classification tasks are insufficient to maintain temporal coherence. This research introduces two augmentation approaches that use the discrete wavelet transform (DWT) to adjust frequency components while preserving temporal dependencies in time series data. Our methods, Wavelet Masking (WaveMask) and Wavelet Mixing (WaveMix), are evaluated against established baselines across various forecasting horizons. To the best of our knowledge, this is the first study to conduct extensive experiments on multivariate time series using the DWT as an augmentation technique. Experimental results demonstrate that our techniques achieve results competitive with previous methods. We also explore cold-start forecasting using downsampled training datasets, comparing outcomes to baseline methods.
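To make the idea of masking and mixing wavelet coefficients concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single-level Haar wavelet, an even-length univariate series, and independent Bernoulli selection per coefficient. The function names (`haar_dwt`, `haar_idwt`, `wave_mask`, `wave_mix`) and the parameters `mask_rate` and `mix_rate` are hypothetical, and the abstract does not specify the wavelet, the number of decomposition levels, or the sampling scheme.

```python
import math
import random


def haar_dwt(x):
    """Single-level Haar DWT of an even-length sequence -> (approx, detail)."""
    if len(x) % 2 != 0:
        raise ValueError("sequence length must be even")
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the time-domain sequence."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x


def wave_mask(x, mask_rate, rng):
    """Zero each wavelet coefficient with probability mask_rate (assumed scheme)."""
    approx, detail = haar_dwt(x)
    approx = [0.0 if rng.random() < mask_rate else c for c in approx]
    detail = [0.0 if rng.random() < mask_rate else c for c in detail]
    return haar_idwt(approx, detail)


def wave_mix(x1, x2, mix_rate, rng):
    """Take each coefficient from x2 with probability mix_rate, else from x1."""
    if len(x1) != len(x2):
        raise ValueError("series must have equal length")
    a1, d1 = haar_dwt(x1)
    a2, d2 = haar_dwt(x2)
    approx = [b if rng.random() < mix_rate else a for a, b in zip(a1, a2)]
    detail = [b if rng.random() < mix_rate else a for a, b in zip(d1, d2)]
    return haar_idwt(approx, detail)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    series_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    series_b = [8.0, 6.0, 7.0, 5.0, 3.0, 0.0, 9.0, 1.0]
    print(wave_mask(series_a, mask_rate=0.25, rng=rng))
    print(wave_mix(series_a, series_b, mix_rate=0.5, rng=rng))
```

Because each coefficient in this sketch covers a short, localized stretch of the series, masking or swapping it perturbs the signal at one scale and position rather than across the whole window, which is the intuition behind preserving temporal structure. A multi-level decomposition with a library such as PyWavelets would be a natural extension, but that choice is also an assumption here.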
