EEGMatch: Learning with Incomplete Labels for Semi-Supervised EEG-based Cross-Subject Emotion Recognition
Rushuang Zhou, Weishan Ye, Zhiguo Zhang, Yanyang Luo, Li Zhang, Linling Li, Gan Huang, Yining Dong, Yuan-Ting Zhang, Zhen Liang
TL;DR
This work tackles label scarcity in EEG-based cross-subject emotion recognition by introducing EEGMatch, a semi-supervised framework that combines EEG-Mixup data augmentation, semi-supervised two-step pairwise learning (prototype-wise and instance-wise), and semi-supervised multi-domain adaptation to leverage both labeled and unlabeled data across source and target domains. The approach derives a target-error bound that guides optimization through a three-domain adversarial objective, integrating a gradient reversal layer to align distributions among $\mathbb{S}$, $\mathbb{U}$, and $\mathbb{T}$. Empirical results on SEED, SEED-IV, and SEED-V under incomplete-label conditions show EEGMatch achieving state-of-the-art performance, with notable improvements on SEED (6.89%) and SEED-IV (1.44%), and ablations validate the contributions of each module. The framework demonstrates robust cross-subject emotion recognition with reduced labeling requirements and provides insights into practical deployment and further improvements such as handling class imbalance and stabilizing multi-domain training.
Abstract
Electroencephalography (EEG) is an objective tool for emotion recognition and has shown promising performance. However, label scarcity remains a major challenge in this field and limits the wide application of EEG-based emotion recognition. In this paper, we propose a novel semi-supervised learning framework (EEGMatch) to leverage both labeled and unlabeled EEG data. First, an EEG-Mixup-based data augmentation method is developed to generate more valid samples for model learning. Second, a semi-supervised two-step pairwise learning method is proposed to bridge prototype-wise and instance-wise pairwise learning: the prototype-wise pairwise learning measures the global relationship between EEG data and the prototypical representation of each emotion class, while the instance-wise pairwise learning captures the local intrinsic relationship among EEG data. Third, a semi-supervised multi-domain adaptation is introduced to align the data representation among multiple domains (labeled source domain, unlabeled source domain, and target domain), alleviating the distribution mismatch. Extensive experiments are conducted on two benchmark databases (SEED and SEED-IV) under a cross-subject leave-one-subject-out cross-validation evaluation protocol. The results show that the proposed EEGMatch outperforms the state-of-the-art methods under different incomplete label conditions (with 6.89% improvement on SEED and 1.44% improvement on SEED-IV), which demonstrates its effectiveness in dealing with the label scarcity problem in emotion recognition using EEG signals. The source code is available at https://github.com/KAZABANA/EEGMatch.
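To make the augmentation step more concrete, the following is a minimal sketch of generic mixup, the technique that EEG-Mixup builds on. It is not the authors' implementation: the abstract does not specify how EEG-Mixup selects the pairs to blend or how the mixing weight is chosen, so the function name mixup_pair, the Beta(alpha, alpha) sampling, and the toy feature vectors below are all assumptions made for illustration.

```python
import random


def mixup_pair(x_a, x_b, y_a, y_b, alpha=0.5, rng=None):
    """Blend two feature vectors and their one-hot labels (generic mixup).

    Assumption: the mixing weight is drawn from Beta(alpha, alpha), as in
    standard mixup. The pairing rule used by EEG-Mixup is not modeled here.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    # Convex combination of the two inputs, element by element.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    # The same convex combination applied to the labels yields a soft label.
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Hypothetical toy data: two 2-dimensional feature vectors, three emotion classes.
x_mix, y_mix, lam = mixup_pair(
    [1.0, 2.0], [3.0, 6.0],            # features of sample A and sample B
    [1.0, 0.0, 0.0], [0.0, 0.0, 1.0],  # one-hot labels of A and B
    rng=random.Random(0),
)
print(lam, x_mix, y_mix)
```

Because the augmented sample is a convex combination of two real samples, its soft label still sums to one and each of its features stays within the range spanned by the two originals.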
