Augmentation of EEG and ECG Time Series for Deep Learning Applications: Integrating Changepoint Detection into the iAAFT Surrogates

Nina Moutonnet, Gregory Scott, Danilo P. Mandic

TL;DR

The paper addresses the challenge of limited and nonstationary EEG and ECG data for deep learning by introducing a changepoint-guided augmentation that integrates offline changepoint detection with modified iAAFT surrogates. The method segments EEG signals at quasi-stationary changes and preserves edge features, while ECG surrogate generation preserves peak morphology, enabling realistic nonstationary augmentation. Validations on EEG seizure detection (CHB-MIT, Siena) and ECG AF detection (CinC 2017) show consistent performance gains over baselines and standard iAAFT surrogates, with particularly strong improvements in EEG precision and overall accuracy. The approach is interpretable, data-efficient, and can enhance clinical DL applications by providing higher-quality synthetic nonstationary data.

Abstract

The performance of deep learning methods critically depends on the quality and quantity of the available training data. This is especially the case for physiological time series, which are both noisy and scarce and therefore call for data augmentation to artificially increase the size of datasets. A further issue is that the time-evolving statistical properties of nonstationary signals prevent the use of standard data augmentation techniques. To this end, we introduce a novel method for augmenting nonstationary time series. This is achieved by combining offline changepoint detection with the iterative amplitude-adjusted Fourier transform (iAAFT), which ensures that the time-frequency properties of the original signal are preserved during augmentation. The proposed method is validated by comparing the performance of i) a deep learning seizure detection algorithm on both the original and augmented versions of the CHB-MIT and Siena scalp electroencephalography (EEG) databases, and ii) a deep learning atrial fibrillation (AF) detection algorithm on the original and augmented versions of the Computing in Cardiology Challenge 2017 dataset. With the proposed method, for the CHB-MIT and Siena datasets respectively, accuracy rose by 4.4% and 1.9%, precision by 10% and 5.5%, recall by 3.6% and 0.9%, and F1 by 4.2% and 1.4%. For the AF classification task, accuracy rose by 0.3%, precision by 2.1%, recall by 0.8%, and F1 by 2.1%.
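
For orientation, the standard iAAFT procedure that the abstract builds on can be sketched as below. This is a minimal NumPy sketch of the textbook algorithm only, not the authors' changepoint-guided variant; the function name `iaaft`, the fixed iteration count `n_iter`, and the `seed` argument are illustrative assumptions.

```python
import numpy as np

def iaaft(x, n_iter=100, seed=None):
    """Textbook iAAFT surrogate (illustrative sketch, not the paper's code)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    target_amp = np.abs(np.fft.rfft(x))   # amplitude spectrum to preserve
    sorted_x = np.sort(x)                 # amplitude distribution to preserve
    s = rng.permutation(x)                # start from a random shuffle
    for _ in range(n_iter):
        # Step 1: impose the original amplitude spectrum, keep current phases.
        phases = np.angle(np.fft.rfft(s))
        s = np.fft.irfft(target_amp * np.exp(1j * phases), n=len(x))
        # Step 2: impose the original value distribution by rank remapping.
        ranks = np.argsort(np.argsort(s))
        s = sorted_x[ranks]
    return s

# Usage on a synthetic random walk (stand-in for a real recording).
x = np.random.default_rng(0).standard_normal(1024).cumsum()
surrogate = iaaft(x, n_iter=50, seed=1)
```

Because the loop ends on the rank-remapping step, the surrogate has exactly the same sample values as the input, and its power spectrum matches only approximately. Applied to a whole nonstationary recording, this procedure does not retain where in time spectral content occurs, which is the limitation the changepoint-based segmentation is meant to address.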

Paper Structure

This paper contains 11 sections, 2 equations, 4 figures, 3 tables, 2 algorithms.

Figures (4)

  • Figure 1: Changepoint detection in nonstationary data. (a) The original F8-T4 EEG channel and the diagnostic sequence depicting the evolution of the $\alpha$-power as a function of time. The zoom-in on the section of high $\alpha$-power variations shows the value of the rolling difference $y^\alpha_{n,\lambda,\kappa}$ and the points outside the threshold of 4 standard deviations (dotted red lines). Finally, a density threshold applied within a period equivalent to the lag determines the sub-segments of sustained variations in the original signal (solid red lines). (b) The F8-T4 EEG channel and various feature-specific changepoints. Observe that the changepoints are indicative of either a general but smoother transition into another state, or of a sharp signal variation. The latter can be seen in the changepoints identified between 68 and 85 seconds, where each peak (indicative of a seizure) is identified as a changepoint. The changepoints used to segment the EEG before data augmentation are filtered once more, to remove any changepoints less than 256 samples (1 s) apart.
  • Figure 2: Original EEG segment and augmented EEG segment. Note that around the changepoints, the edges of the surrogate are identical to the original data.
  • Figure 3: The time-frequency spectrogram, power spectrum and histogram of the original F8-T4 EEG segment (a), an iAAFT surrogate obtained without using offline changepoint detection (b), and an iAAFT surrogate obtained using our offline changepoint detection method (c). Although the power spectrum and histograms of both (a) and (b) are similar, the time-frequency characteristics of the original signal are lost in the standard surrogate. Our surrogate has a power spectrum that differs slightly from that of the original segment, but the overall time-frequency characteristics of the original signal are well preserved.
  • Figure 4: Original ECG segment (top) and augmented ECG segment (bottom) obtained using a minimum peak distance of 60 samples, a maximum peak distance of 80 samples, and a margin $M$ of 10 samples (hyperparameter configuration B, detailed in the Results section). Note that the amplitude and location of the detected peaks, denoted as red crosses, are identical in the surrogate and the original data.
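
The thresholding steps described in the Figure 1 caption can be illustrated with a rough sketch. It assumes that the rolling difference is a simple lagged difference of the diagnostic sequence and that the threshold is centred on its mean; the exact definition of $y^\alpha_{n,\lambda,\kappa}$ and the density step are not given in the caption, so the density step is omitted here. The function name `flag_changepoints` and its parameters are hypothetical.

```python
import numpy as np

def flag_changepoints(diagnostic, lag=256, n_std=4.0, min_gap=256):
    """Flag candidate changepoints in a diagnostic sequence (illustrative).

    Assumptions, not taken from the paper: the rolling difference is
    d[n] - d[n - lag], and the threshold is mean +/- n_std * std of it.
    """
    d = np.asarray(diagnostic, dtype=float)
    diff = d[lag:] - d[:-lag]                      # lagged difference
    outside = np.abs(diff - diff.mean()) > n_std * diff.std()
    candidates = np.flatnonzero(outside) + lag     # back to signal indices
    # Drop candidates closer than min_gap samples to the last kept one
    # (the caption mentions a 256-sample, i.e. 1 s, minimum spacing).
    kept = []
    for c in candidates:
        if not kept or c - kept[-1] >= min_gap:
            kept.append(int(c))
    return kept

# Usage on a synthetic diagnostic sequence with one abrupt level change.
power = np.concatenate([np.zeros(2000), 10.0 * np.ones(2000)])
changepoints = flag_changepoints(power, lag=64)
segments = np.split(power, changepoints)           # quasi-stationary pieces
```

Each resulting segment could then be passed to a surrogate generator separately, which is the general idea behind segment-wise augmentation. The edge-preservation behaviour shown in Figure 2 is not reproduced by this sketch.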