EEG Emotion Classification Using an Enhanced Transformer-CNN-BiLSTM Architecture with Dual Attention Mechanisms
S M Rakib Ul Karim, Wenyi Lu, Diponkor Bala, Rownak Ara Rasul, Sean Goggins
TL;DR
The paper addresses EEG-based emotion recognition, which is made difficult by high dimensionality, noise, and subject variability, and introduces an Enhanced Transformer-CNN-BiLSTM architecture with dual attention to model spatial-temporal EEG dynamics. The method is evaluated on a public dataset of $N=2{,}529$ samples with $D=988$ features across three emotion classes under 5-fold cross-validation. It achieves a test accuracy of $99.19\%$ with a training/test gap of $0.56\%$, which suggests that the model generalizes well rather than overfitting. Feature analyses identify covariance-based features as the most informative: SHAP attributions corroborate their importance, and ablation shows large performance drops when covariance information is removed. Overall, the work reports state-of-the-art, robust, and interpretable EEG emotion classification with potential for real-time and clinical deployment, and it suggests future extensions to multimodal data and cross-subject robustness.
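The evaluation protocol summarized above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes stratified folds (the summary only states that 5-fold cross-validation was used), it assumes the gap is training accuracy minus test accuracy, and the names `stratified_kfold` and `generalization_gap` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with per-class balance across folds.

    Assumption: stratification is an illustrative choice; the source only
    states that 5-fold cross-validation was used.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets a share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for f in range(k) if f != i for j in folds[f])
        yield train_idx, test_idx


def generalization_gap(train_acc, test_acc):
    """Training accuracy minus test accuracy (assumed definition of the gap)."""
    return train_acc - test_acc


# Placeholder labels with the dataset's sample count and number of classes;
# the real class proportions are not stated here.
labels = [i % 3 for i in range(2529)]
splits = list(stratified_kfold(labels, k=5))
```

Each sample lands in exactly one test fold, so averaging per-fold test accuracy uses every sample once for evaluation. A small gap between training and test accuracy across folds is what supports the generalization claim.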
Abstract
Electroencephalography (EEG)-based emotion recognition plays a critical role in affective computing and emerging decision-support systems, yet remains challenging due to high-dimensional, noisy, and subject-dependent signals. This study investigates whether hybrid deep learning architectures that integrate convolutional, recurrent, and attention-based components can improve emotion classification performance and robustness in EEG data. We propose an enhanced hybrid model that combines convolutional feature extraction, bidirectional temporal modeling, and self-attention mechanisms with regularization strategies to mitigate overfitting. Experiments conducted on a publicly available EEG dataset spanning three emotional states (neutral, positive, and negative) demonstrate that the proposed approach achieves state-of-the-art classification performance, significantly outperforming classical machine learning and neural baselines. Statistical tests confirm the robustness of these performance gains under cross-validation. Feature-level analyses further reveal that covariance-based EEG features contribute most strongly to emotion discrimination, highlighting the importance of inter-channel relationships in affective modeling. These findings suggest that carefully designed hybrid architectures can effectively balance predictive accuracy, robustness, and interpretability in EEG-based emotion recognition, with implications for applied affective computing and human-centered intelligent systems.
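To make the notion of covariance-based features concrete, the sketch below computes the sample covariance between EEG channels over one window and flattens the upper triangle into a feature vector. It is an illustration under stated assumptions: the dataset's actual feature extraction is not described in this section, and the names `channel_covariance` and `covariance_features` are hypothetical.

```python
def channel_covariance(window):
    """Sample covariance matrix for a window shaped [channels][samples]."""
    n = len(window[0])
    if n < 2:
        raise ValueError("need at least two samples per channel")
    means = [sum(ch) / n for ch in window]
    centered = [[x - m for x in ch] for ch, m in zip(window, means)]
    c = len(window)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (n - 1)
         for j in range(c)]
        for i in range(c)
    ]


def covariance_features(window):
    """Flatten the upper triangle (diagonal included) of the covariance matrix."""
    cov = channel_covariance(window)
    c = len(cov)
    return [cov[i][j] for i in range(c) for j in range(i, c)]


# Two toy channels (not real EEG): the second is exactly twice the first.
window = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]]
features = covariance_features(window)
```

The diagonal entries are per-channel variances, while the off-diagonal entries capture the inter-channel relationships that the abstract highlights. For $C$ channels the flattened vector has $C(C+1)/2$ entries.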
