Improving EEG Classification Through Randomly Reassembling Original and Generated Data with Transformer-based Diffusion Models

Mingzhi Chen; Yiyu Gui; Yuqi Su; Yuesheng Zhu; Guibo Luo; Yuchao Yang

TL;DR

The paper addresses the scarcity and low quality of EEG data for classification by introducing a Transformer-based denoising diffusion probabilistic model (EEG Diffusion Transformer), augmented with Multi-Scale Convolution and Dynamic Fourier Spectrum Information modules, to generate high-quality EEG signals. It pairs generation with a Generated-Original Signals Reassembled (GO) augmentation, which reconstructs labels for generated data and creates vicinal data by reassembling original and generated signals, so that training minimizes both the empirical risk and the vicinal risk. Across four EEG datasets and two backbone classifiers, the approach improves generation quality (lower EEG-FID) and classification accuracy on the Bonn, SleepEDF-20, FACED, and Shu datasets. The results suggest the method generalizes across EEG tasks relevant to diagnostics and brain-computer interfaces. The authors note that it is currently limited to EEG and propose extending it to other time series in future work.

Abstract

Electroencephalogram (EEG) classification is widely used in medical and engineering applications, where it is important for understanding brain function, diagnosing diseases, and assessing mental health conditions. However, the scarcity of EEG data severely restricts the performance of EEG classification networks, and generative model-based data augmentation methods have emerged as potential solutions to this challenge. Existing methods have two problems: (1) the quality of the generated EEG signals is not high; (2) the generated data do not effectively improve EEG classification networks. In this paper, we propose a Transformer-based denoising diffusion probabilistic model and a generated data-based augmentation method to address these two problems. To suit the characteristics of EEG signals, we propose a constant-factor scaling method to preprocess the signals, which reduces the loss of information. We incorporate Multi-Scale Convolution and Dynamic Fourier Spectrum Information modules into the model, improving the stability of the training process and the quality of the generated data. The proposed augmentation method randomly reassembles the generated data with the original data in the time domain to obtain vicinal data, which improves model performance by minimizing both the empirical risk and the vicinal risk. We verify the proposed augmentation method on four EEG datasets for four tasks and observe significant accuracy improvements: 14.00% on the Bonn dataset; 6.38% on the SleepEDF-20 dataset; 9.42% on the FACED dataset; 2.5% on the Shu dataset. We will make the code of our method publicly accessible soon.
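The time-domain reassembly described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `reassemble`, the equal-length segmentation, the 50% selection probability, and the rule that mixes labels by time share are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def reassemble(original, generated, y_orig, y_gen, n_segments=8, rng=None):
    """Splice two signals along the time axis, segment by segment.

    original, generated: arrays of shape (channels, time)
    y_orig, y_gen: label probability vectors of shape (n_classes,)
    Returns (vicinal signal, mixed label, share of time taken from original).
    All choices here are illustrative assumptions, not the paper's method.
    """
    rng = np.random.default_rng() if rng is None else rng
    if original.shape != generated.shape:
        raise ValueError("signals must share the same shape")
    n_time = original.shape[-1]
    # Segment boundaries over the time axis.
    edges = np.linspace(0, n_time, n_segments + 1).astype(int)
    # For each segment, True means "take it from the generated signal".
    take_generated = rng.random(n_segments) < 0.5
    mixed = original.copy()
    for k in range(n_segments):
        if take_generated[k]:
            lo, hi = edges[k], edges[k + 1]
            mixed[..., lo:hi] = generated[..., lo:hi]
    # Fraction of time samples that still come from the original signal.
    lengths = np.diff(edges)
    lam = 1.0 - lengths[take_generated].sum() / n_time
    # Assumed label rule: weight each label by its share of the time axis.
    y_mixed = lam * np.asarray(y_orig, dtype=float) \
        + (1.0 - lam) * np.asarray(y_gen, dtype=float)
    return mixed, y_mixed, lam
```

In a training loop under these assumptions, the classifier would see both the original pairs (the empirical-risk term) and the reassembled pairs returned here (the vicinal-risk term), with `y_gen` standing in for the reconstructed label of the generated signal.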

Paper Structure (28 sections, 17 equations, 7 figures, 3 tables)

This paper contains 28 sections, 17 equations, 7 figures, 3 tables.

Introduction
Related Work
Generative Models
EEG Data Augmentation via Generative Models
EEG Data Generation and Augmentation
EEG Data Generation
Denoising Diffusion Probabilistic Models
EEG Data Preprocessing
EEG Diffusion Transformer
Generated-Original Signals Reassembled Data Augmentation
Directly Incorporating Generated Data May Not Be a Good Idea
Label Reconstruction for Generated Data
Obtain Vicinal Data by Reassembling Generated-Original Signals
Experiments
EEG Classification Network
...and 13 more sections

Figures (7)

Figure 1: The Illustration of the Proposed Method.
Figure 2: (a) depicts the overall architecture of EEG Diffusion Transformer, while (b) provides specific details of the Multi-Scale Convolution (MSC) module, Dynamic Fourier Spectrum Information (DFSI) module, and Diffusion Transformer (DiT) block.
Figure 3: Comparison of the Fourier Spectra of EEG Signals Generated by Different Models.
Figure 4: Ablation on Label Reconstruction and GO Loss.
Figure 5: Real EEG signals and generated EEG signals on Bonn dataset.
...and 2 more figures
