EEG Emotion Classification Using an Enhanced Transformer-CNN-BiLSTM Architecture with Dual Attention Mechanisms

S M Rakib UI Karim, Wenyi Lu, Diponkor Bala, Rownak Ara Rasul, Sean Goggins

TL;DR

The paper addresses EEG-based emotion recognition, which is complicated by high dimensionality, noise, and subject variability, by introducing an Enhanced Transformer-CNN-BiLSTM architecture with dual attention to model spatial-temporal EEG dynamics. The method is evaluated on a public dataset of $N=2{,}529$ samples with $D=988$ features across three emotion classes under 5-fold cross-validation, achieving a test accuracy of $99.19\%$ with a training/test gap of $0.56\%$, which the authors take as evidence of strong generalization. Feature analyses identify covariance-based features as the most informative: SHAP attributions corroborate their importance, and ablation shows large performance drops when covariance information is removed. Overall, the work reports state-of-the-art, robust, and interpretable EEG emotion classification with potential for real-time clinical deployment, and suggests future extensions to multimodal data and cross-subject robustness.
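As a rough illustration of the 5-fold protocol mentioned above, the sketch below splits sample indices into five class-balanced folds using only the Python standard library. Whether the authors stratified their folds is not stated here, so stratification is an assumption, and the `stratified_folds` helper, the seed, and the evenly cycled placeholder labels are all hypothetical; only the sample count, class count, and fold count come from the summary.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with near-equal class proportions.

    Hypothetical helper: stratification is an assumption, not a detail
    confirmed by the paper summary.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0  # continue the round-robin across classes to balance fold sizes
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[(pos + offset) % k].append(idx)
        offset += len(members)
    return folds


# Placeholder labels: 2,529 samples cycling through three classes.
# The real class distribution of the dataset is not given in the summary.
labels = [i % 3 for i in range(2529)]
folds = stratified_folds(labels, k=5, seed=0)

for test_fold in range(5):
    test_idx = folds[test_fold]
    train_idx = [i for f, fold in enumerate(folds) if f != test_fold for i in fold]
    print(f"fold {test_fold}: train={len(train_idx)} test={len(test_idx)}")
```

Each fold serves once as the held-out test set while the other four are used for training, so every sample is evaluated exactly once.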

Abstract

Electroencephalography (EEG)-based emotion recognition plays a critical role in affective computing and emerging decision-support systems, yet remains challenging due to high-dimensional, noisy, and subject-dependent signals. This study investigates whether hybrid deep learning architectures that integrate convolutional, recurrent, and attention-based components can improve emotion classification performance and robustness in EEG data. We propose an enhanced hybrid model that combines convolutional feature extraction, bidirectional temporal modeling, and self-attention mechanisms with regularization strategies to mitigate overfitting. Experiments conducted on a publicly available EEG dataset spanning three emotional states (neutral, positive, and negative) demonstrate that the proposed approach achieves state-of-the-art classification performance, significantly outperforming classical machine learning and neural baselines. Statistical tests confirm the robustness of these performance gains under cross-validation. Feature-level analyses further reveal that covariance-based EEG features contribute most strongly to emotion discrimination, highlighting the importance of inter-channel relationships in affective modeling. These findings suggest that carefully designed hybrid architectures can effectively balance predictive accuracy, robustness, and interpretability in EEG-based emotion recognition, with implications for applied affective computing and human-centered intelligent systems.
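To make "covariance-based features" concrete, the sketch below computes the inter-channel covariance matrix of one EEG window and flattens its upper triangle into a feature vector. This is a generic construction, not necessarily how the dataset's 988 features were produced; the `covariance_features` helper, the 4-channel layout, and the 256-sample window are hypothetical.

```python
import numpy as np


def covariance_features(window):
    """Flatten the upper triangle of the channel-by-channel covariance matrix.

    window: array of shape (n_channels, n_samples).
    Returns n_channels * (n_channels + 1) / 2 values: each channel's variance
    plus every pairwise covariance between channels.
    """
    window = np.asarray(window, dtype=float)
    cov = np.cov(window)  # rows are treated as variables (channels)
    rows, cols = np.triu_indices(cov.shape[0])
    return cov[rows, cols]


# Synthetic stand-in for one EEG window (assumed: 4 channels, 256 samples).
rng = np.random.default_rng(0)
window = rng.standard_normal((4, 256))
feats = covariance_features(window)
print(feats.shape)
```

The off-diagonal entries are what capture the inter-channel relationships that the abstract highlights.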

Paper Structure

This paper contains 57 sections, 15 equations, 10 figures, 5 tables, 1 algorithm.

Figures (10)

  • Figure 1: Enhanced Transformer-CNN-BiLSTM architecture. The seven-stage pipeline comprises: (1) input processing of EEG features, (2) CNN feature extraction with residual connections, (3) bidirectional LSTM temporal modeling, (4) dual multi-head attention mechanisms, (5) dual pooling strategy, (6) deep fully connected classifier, and (7) softmax output for emotion classification. Key innovations include residual CNN blocks, dual attention layers (16 and 8 heads), and advanced regularization.
  • Figure 2: Overview of the proposed Enhanced Transformer-CNN-BiLSTM model. The network integrates spatial feature extraction (CNN), temporal sequence modeling (BiLSTM), and dual multi-head attention for importance weighting, followed by a dense classifier for emotion prediction.
  • Figure 3: Confusion Matrices for Baseline Models Showing Per-Class Classification Performance
  • Figure 4: Model Performance Comparison with Statistical Significance and Per-Class Analysis
  • Figure 5: Top 15 Features by Consensus Importance Across Five Methods
  • ...and 5 more figures
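The seven-stage pipeline described in the Figure 1 caption can be sketched in PyTorch roughly as follows. Only the stage order and the attention head counts (16 and 8) come from the caption. The layer widths, kernel sizes, dropout rate, the choice to treat the 988 features as a one-channel sequence, and the names `ResidualConvBlock` and `HybridSketch` are all assumptions made for illustration, so this should not be read as the authors' implementation.

```python
import torch
import torch.nn as nn


class ResidualConvBlock(nn.Module):
    """Two 1-D convolutions with a skip connection (assumed layout)."""

    def __init__(self, channels, kernel_size=3):
        super().__init__()
        padding = kernel_size // 2
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size, padding=padding),
            nn.BatchNorm1d(channels),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size, padding=padding),
            nn.BatchNorm1d(channels),
        )
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(x + self.body(x))


class HybridSketch(nn.Module):
    """CNN -> BiLSTM -> two attention layers -> dual pooling -> classifier."""

    def __init__(self, n_classes=3, conv_channels=32, lstm_hidden=64, dropout=0.3):
        super().__init__()
        # Stage 2: convolutional feature extraction with a residual block.
        self.stem = nn.Sequential(
            nn.Conv1d(1, conv_channels, kernel_size=7, stride=4, padding=3),
            nn.BatchNorm1d(conv_channels),
            nn.ReLU(),
        )
        self.res = ResidualConvBlock(conv_channels)
        # Stage 3: bidirectional LSTM over the convolved sequence.
        self.bilstm = nn.LSTM(conv_channels, lstm_hidden,
                              batch_first=True, bidirectional=True)
        d_model = 2 * lstm_hidden  # must be divisible by both head counts
        # Stage 4: two multi-head self-attention layers (16 and 8 heads).
        self.attn1 = nn.MultiheadAttention(d_model, 16, dropout=dropout, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.attn2 = nn.MultiheadAttention(d_model, 8, dropout=dropout, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        # Stage 6: fully connected classifier on the pooled representation.
        self.classifier = nn.Sequential(
            nn.Linear(2 * d_model, 128),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(128, n_classes),
        )

    def forward(self, x):
        # Stage 1: x has shape (batch, n_features); add a channel axis.
        h = x.unsqueeze(1)
        h = self.res(self.stem(h))       # (batch, channels, length)
        h = h.transpose(1, 2)            # (batch, length, channels)
        h, _ = self.bilstm(h)            # (batch, length, 2 * lstm_hidden)
        a, _ = self.attn1(h, h, h, need_weights=False)
        h = self.norm1(h + a)
        a, _ = self.attn2(h, h, h, need_weights=False)
        h = self.norm2(h + a)
        # Stage 5: dual pooling, here mean and max over the sequence axis.
        pooled = torch.cat([h.mean(dim=1), h.max(dim=1).values], dim=1)
        # Stage 7: return logits; softmax is applied by the loss or at inference.
        return self.classifier(pooled)


model = HybridSketch()
logits = model(torch.randn(4, 988))  # a batch of 4 samples with 988 features
probs = torch.softmax(logits, dim=1)
print(logits.shape)
```

Returning logits rather than probabilities keeps the sketch compatible with `nn.CrossEntropyLoss`, which applies the softmax internally.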