EmT: A Novel Transformer for Generalized Cross-subject EEG Emotion Recognition

Yi Ding, Chengxuan Tong, Shuailei Zhang, Muyun Jiang, Yong Li, Kevin Lim Jun Liang, Cuntai Guan

TL;DR

This paper introduces the emotion transformer (EmT), a novel transformer model designed for generalized cross-subject electroencephalography (EEG) emotion classification and regression. EmT achieves higher results than the baseline methods on both tasks.

Abstract

Integrating prior knowledge of neurophysiology into neural network architecture enhances the performance of emotion decoding. While numerous techniques emphasize learning spatial and short-term temporal patterns, there has been limited emphasis on capturing the vital long-term contextual information associated with emotional cognitive processes. In order to address this discrepancy, we introduce a novel transformer model called emotion transformer (EmT). EmT is designed to excel in both generalized cross-subject EEG emotion classification and regression tasks. In EmT, EEG signals are transformed into a temporal graph format, creating a sequence of EEG feature graphs using a temporal graph construction module (TGC). A novel residual multi-view pyramid GCN module (RMPG) is then proposed to learn dynamic graph representations for each EEG feature graph within the series, and the learned representations of each graph are fused into one token. Furthermore, we design a temporal contextual transformer module (TCT) with two types of token mixers to learn the temporal contextual information. Finally, the task-specific output module (TSO) generates the desired outputs. Experiments on four publicly available datasets show that EmT achieves higher results than the baseline methods for both EEG emotion classification and regression tasks. The code is available at https://github.com/yi-ding-cs/EmT.

Paper Structure (41 sections, 18 equations, 9 figures, 4 tables)

Figures (9)

  • Figure 1: The network structure of EmT. The temporal graphs from TGC are used as the input to RMPG, which transforms each graph into one token embedding. TCT then extracts the temporal contextual information via specially designed token mixers. We propose two types of TCT structures, named TCT-Clas and TCT-Regr, for classification and regression tasks respectively. For classification, a mean fusion is applied before feeding the learned embeddings into an MLP head to produce the output. For regression, an MLP head projects each embedding in the sequence into a scalar, generating a sequence that can be used to regress the temporally continuous labels.
  • Figure 2: Illustration of TGC. Each segment, $\bar{X}$, is split into several sub-segments, $\tilde{X}$. Features in different frequency bands are extracted for each $\tilde{X}$ channel by channel to form $\boldsymbol{F}$. Each EEG channel is then regarded as a node, and the extracted features are treated as node attributes. Combining all the graphs in time order gives the temporal graphs, $\boldsymbol{G}_{T}$.
  • Figure 3: Effect of feature types on the emotion classification and regression performance of EmT using SEED and MAHNOB-HCI. Using rPSD gives the best overall performance. We don't add PSD results for regression tasks in (b) because the model cannot converge.
  • Figure 4: Effect of the depth (a) and width (b) of GCNs in RMPG on classification performance using SEED. For the depth analysis, 1,1 indicates that both GCN branches in RMPG have 1 layer each. The width is the hidden size of the GCN layers.
  • Figure 5: Effect of the number of TCT blocks on the emotion classification and regression performance of EmT using SEED and MAHNOB-HCI.
  • ...and 4 more figures
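
The temporal graph construction described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the band edges in `BANDS`, the FFT-based relative band power (loosely mirroring the rPSD feature named in Figure 3), the function names, and the sampling rate, window length and hop used in the example are all assumptions made for the sketch.

```python
import numpy as np

# Hypothetical band edges in Hz; the paper's exact bands are not stated in this summary.
BANDS = {"theta": (4, 8), "alpha": (8, 14), "beta": (14, 31), "gamma": (31, 50)}


def band_powers(sub_segment, fs):
    """Relative band power per channel for one sub-segment.

    sub_segment: array of shape (channels, samples).
    Returns node attributes of shape (channels, n_bands).
    """
    n = sub_segment.shape[-1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    power = np.abs(np.fft.rfft(sub_segment, axis=-1)) ** 2
    feats = np.stack(
        [power[:, (freqs >= lo) & (freqs < hi)].sum(axis=-1)
         for lo, hi in BANDS.values()],
        axis=-1,
    )
    # Normalise across bands; the floor guards against an all-zero channel.
    total = np.maximum(feats.sum(axis=-1, keepdims=True), np.finfo(float).tiny)
    return feats / total


def temporal_graphs(segment, fs, sub_len, hop):
    """Split a (channels, samples) segment into sub-segments and stack the
    per-channel features in time order.

    Returns an array of shape (n_sub_segments, channels, n_bands): one
    feature graph (nodes = channels) per sub-segment.
    """
    starts = range(0, segment.shape[-1] - sub_len + 1, hop)
    return np.stack([band_powers(segment[:, s:s + sub_len], fs) for s in starts])


# Example with synthetic data: 62 channels, 4 s at an assumed 200 Hz,
# 1 s sub-segments with 50% overlap.
rng = np.random.default_rng(0)
segment = rng.standard_normal((62, 800))
G = temporal_graphs(segment, fs=200, sub_len=200, hop=100)
print(G.shape)  # (7, 62, 4)
```

Each slice `G[t]` holds the node attributes of one graph in the sequence. The sketch stops at node features; how edges between channels are defined or learned is left to the downstream graph module and is not modelled here.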