Augmentation of EEG and ECG Time Series for Deep Learning Applications: Integrating Changepoint Detection into the iAAFT Surrogates
Nina Moutonnet, Gregory Scott, Danilo P. Mandic
TL;DR
The paper addresses the challenge of limited and nonstationary EEG and ECG data for deep learning by introducing a changepoint-guided augmentation method that integrates offline changepoint detection with modified iAAFT surrogates. For EEG, the signal is segmented at changepoints between quasi-stationary regimes and edge features are preserved; for ECG, surrogate generation preserves peak morphology. Together these enable realistic augmentation of nonstationary signals. Validation on EEG seizure detection (CHB-MIT, Siena) and ECG atrial fibrillation (AF) detection (CinC 2017) shows consistent performance gains over baselines and standard iAAFT surrogates, with the largest improvements in EEG precision and accuracy. The approach is interpretable and data-efficient, and can support clinical deep learning applications by providing higher-quality synthetic nonstationary data.
Abstract
The performance of deep learning methods critically depends on the quality and quantity of the available training data. This is especially the case for physiological time series, which are both noisy and scarce and therefore call for data augmentation to artificially increase the size of datasets. A further issue is that the time-evolving statistical properties of nonstationary signals prevent the use of standard data augmentation techniques. To this end, we introduce a novel method for augmenting nonstationary time series. This is achieved by combining offline changepoint detection with the iterative amplitude-adjusted Fourier transform (iAAFT), which ensures that the time-frequency properties of the original signal are preserved during augmentation. The proposed method is validated by comparing the performance of i) a deep learning seizure detection algorithm on the original and augmented versions of the CHB-MIT and Siena scalp electroencephalography (EEG) databases, and ii) a deep learning atrial fibrillation (AF) detection algorithm on the original and augmented versions of the Computing in Cardiology Challenge 2017 dataset. With the proposed augmentation, for the CHB-MIT and Siena datasets respectively, accuracy rose by 4.4% and 1.9%, precision by 10% and 5.5%, recall by 3.6% and 0.9%, and F1 by 4.2% and 1.4%. For the AF classification task, accuracy rose by 0.3%, precision by 2.1%, recall by 0.8%, and F1 by 2.1%.
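The general idea of pairing changepoints with iAAFT can be illustrated with a minimal sketch. It assumes the changepoints are already known (no detector is implemented here) and it uses the standard iAAFT on each segment, not the modified variant proposed in the paper, so edge features and ECG peak morphology are not specifically handled. The function names `iaaft` and `segmentwise_surrogate`, the iteration count, and the toy signal are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def iaaft(x, n_iter=100, rng=None):
    """Standard iAAFT surrogate: keeps the exact amplitude distribution of x
    and approximately matches its power spectrum."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    sorted_x = np.sort(x)                   # target amplitude distribution
    target_amp = np.abs(np.fft.rfft(x))     # target Fourier magnitudes
    s = rng.permutation(x)                  # random shuffle as starting point
    for _ in range(n_iter):
        # Step 1: impose the target spectrum, keep the current phases.
        phases = np.angle(np.fft.rfft(s))
        s = np.fft.irfft(target_amp * np.exp(1j * phases), n=len(x))
        # Step 2: impose the target amplitudes by rank ordering.
        ranks = np.argsort(np.argsort(s))
        s = sorted_x[ranks]
    return s


def segmentwise_surrogate(x, changepoints, n_iter=100, rng=None):
    """Apply iAAFT independently to each segment between changepoints.

    `changepoints` are sample indices strictly inside (0, len(x)); they are
    assumed to come from some offline changepoint detector.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    bounds = [0, *sorted(changepoints), len(x)]
    pieces = [iaaft(x[a:b], n_iter, rng) for a, b in zip(bounds[:-1], bounds[1:])]
    return np.concatenate(pieces)


# Toy nonstationary signal: the variance jumps at sample 256 (hypothetical data).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 256), rng.normal(0, 5, 256)])
y = segmentwise_surrogate(x, changepoints=[256], n_iter=50, rng=rng)
```

In this sketch each segment of `y` is a reordering of the corresponding segment of `x`, so the variance change at sample 256 is retained. Running `iaaft` once over the whole signal would instead shuffle samples across the changepoint and spread the high-variance regime over the full recording.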
