Enhancing EEG Signal-Based Emotion Recognition with Synthetic Data: Diffusion Model Approach

Gourav Siddhad; Masakazu Iwamura; Partha Pratim Roy

TL;DR

This work tackles data scarcity in EEG-based emotion recognition with a conditional denoising diffusion model that is conditioned on noise-augmented real EEG and whose synthetic output supplements the real data. The method generates raw EEG-like signals and, when these are combined with real data, improves classifier performance across deep learning (DL) and non-DL models on DEAP and SADT, outperforming GAN-based and vanilla DDPM baselines. Quantitative gains are supported by qualitative analysis (t-SNE) and interpretability analysis (SHAP), while ablation studies identify the most effective proportions of synthetic data. The approach shows potential for data-efficient EEG affective computing; future work will address broader emotion spaces, additional datasets, and efficiency optimizations.

Abstract

Emotions are crucial in human life, influencing perceptions, relationships, behaviour, and choices. Emotion recognition using Electroencephalography (EEG) in the Brain-Computer Interface (BCI) domain presents significant challenges, particularly the need for extensive datasets. This study generates synthetic EEG samples that resemble real samples yet remain distinct from them by adding noise to the conditioning signal of a conditional denoising diffusion probabilistic model, thus addressing the prevalent issue of data scarcity in EEG research. The proposed method is tested on the DEAP and SADT datasets, showing up to 5.6% improvement in classification accuracy when synthetic data is used with DEAP and similarly positive results with SADT. This improvement is larger than that obtained with traditional Generative Adversarial Network (GAN) based and Denoising Diffusion Probabilistic Model (DDPM) based approaches. The study further evaluates state-of-the-art classifiers on EEG data, using both real and synthetic data with varying noise levels, and applies t-SNE and SHAP for detailed analysis and interpretability. The proposed diffusion-based approach to EEG data generation appears promising for improving the accuracy of emotion recognition systems and marks a notable contribution to EEG-based emotion recognition.
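As a rough illustration of how real and synthetic samples might be combined before training a classifier, the sketch below appends a chosen proportion of synthetic samples to the real training set. It follows the convention stated in the Figure 5 caption, where 100% means an equal number of synthetic and real samples. The function name `mix_with_synthetic`, the array shapes, and the random placeholder data are assumptions made for illustration; they are not taken from the paper's implementation.

```python
import numpy as np


def mix_with_synthetic(x_real, y_real, x_syn, y_syn, proportion, rng):
    """Append `proportion * len(x_real)` synthetic samples to the real set.

    proportion=1.0 corresponds to "100%": as many synthetic as real samples.
    Hypothetical helper, not the authors' code.
    """
    n_syn = int(round(proportion * len(x_real)))
    if n_syn > len(x_syn):
        raise ValueError("not enough synthetic samples for this proportion")
    # Draw the synthetic subset without replacement.
    idx = rng.choice(len(x_syn), size=n_syn, replace=False)
    x = np.concatenate([x_real, x_syn[idx]])
    y = np.concatenate([y_real, y_syn[idx]])
    # Shuffle so real and synthetic samples are interleaved.
    perm = rng.permutation(len(x))
    return x[perm], y[perm]


# Placeholder data: 100 real and 200 synthetic trials, 4 channels, 16 time points.
rng = np.random.default_rng(0)
x_real = rng.standard_normal((100, 4, 16))
y_real = rng.integers(0, 2, size=100)
x_syn = rng.standard_normal((200, 4, 16))
y_syn = rng.integers(0, 2, size=200)

x_train, y_train = mix_with_synthetic(x_real, y_real, x_syn, y_syn, 0.5, rng)
print(x_train.shape, y_train.shape)
```

Only the training split would be mixed in this way; evaluating on real held-out data keeps the reported accuracy comparable with the no-synthetic baseline.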

Paper Structure (22 sections, 11 equations, 6 figures, 4 tables)

Introduction
Related Work
Emotion Recognition
Synthetic EEG Data
Methodology
Conditional Denoising Diffusion Model
Gaussian Diffusion Process
Optimizing the Denoising Model
Inference via Iterative Refinement
Model Architecture and Noise Schedulers
Augmentation Module
Classifiers
Results and Discussion
Experimental Data
Implementation Details
...and 7 more sections

Figures (6)

Figure 1: The diffusion model's training process includes forward diffusion $q$ (left to right), where Gaussian noise from a standard normal distribution is added to the real signal $\boldsymbol{x}$ in incremental steps, resulting in $\boldsymbol{x}_t$. Data generation utilizes reverse diffusion $p$ (right to left), where a denoising UNet gradually removes noise from $\boldsymbol{x}_T$, conditioned on $\boldsymbol{x}_\Delta$, to produce the generated signal $\boldsymbol{y}$. Notably, the denoising UNet accepts two inputs: the conditioning information $\boldsymbol{x}_\Delta$ and the noisy signal to be denoised, $\boldsymbol{x}_T$. Here, the Augmentation module controls the noise added to $\boldsymbol{x}$ to generate the conditioning signal $\boldsymbol{x}_{\Delta}$.
Figure 2: U-Net architecture of diffusion model. The source sample $\boldsymbol{x}$ is concatenated with the target sample $\boldsymbol{y}_t$. Self-attention is performed on 16$\times$16 feature maps.
Figure 3: Visual Comparison of Real (blue) and Synthetic (red) EEG Samples for $\Delta$: (a) $\Delta=0.01$, (b) $\Delta=0.05$, and (c) $\Delta=0.1$. The high visual similarity in temporal patterns and amplitude dynamics confirms the synthetic data effectively replicates key characteristics of real EEG signals.
Figure 4: t-SNE visualization comparing real and synthetic EEG data. The spatial closeness of clusters indicates the degree of similarity between real and synthetic samples.
Figure 5: Charts illustrating classifier accuracy for different emotional states as the proportion of diffusion-generated ($\Delta=0.01$) training data increases. Dotted lines show baseline accuracy (no synthetic data). '100%' indicates an equal number of synthetic and real samples. Chart (a) shows 10% increments of synthetic data, while (b) shows 100% and 200% increments.
...and 1 more figure
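The forward diffusion $q$ and the conditioning signal $\boldsymbol{x}_\Delta$ described in the Figure 1 caption can be sketched in a few lines. The closed-form sampling of $\boldsymbol{x}_t$ below is the standard DDPM result, $\boldsymbol{x}_t = \sqrt{\bar{\alpha}_t}\,\boldsymbol{x} + \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon}$. Everything else is an assumption made for illustration: the linear schedule and its endpoints, the number of steps, the toy sine-wave signal, and in particular `make_condition`, which reads the Augmentation module as additive Gaussian noise with standard deviation $\Delta$. The paper's actual module and scheduler may differ, and the denoising UNet and reverse process are not shown.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (endpoints are common DDPM defaults, assumed here)."""
    return np.linspace(beta_start, beta_end, num_steps)


def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form; also return the noise used."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps


def make_condition(x, delta, rng):
    """Hypothetical augmentation: x_delta = x + delta * N(0, I)."""
    return x + delta * rng.standard_normal(x.shape)


rng = np.random.default_rng(0)
num_steps = 1000  # assumed number of diffusion steps T
betas = linear_beta_schedule(num_steps)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta_t)

# Toy one-channel "signal" standing in for a real EEG segment.
x = np.sin(np.linspace(0.0, 8.0 * np.pi, 512))

x_mid, _ = forward_diffuse(x, num_steps // 2, alpha_bar, rng)  # partly noised
x_T, _ = forward_diffuse(x, num_steps - 1, alpha_bar, rng)     # almost pure noise
x_delta = make_condition(x, delta=0.01, rng=rng)               # conditioning signal

print(float(alpha_bar[-1]))                # small: little of x survives at step T
print(float(np.abs(x_delta - x).max()))    # small: x_delta stays close to x
```

During training, a denoising network would receive a noised sample together with the conditioning signal and be optimized to predict the injected noise `eps`. Under this reading, a larger `delta` loosens the tie between the generated signal and the real sample it is conditioned on.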
