KAM -- a Kernel Attention Module for Emotion Classification with EEG Data

Dongyang Kuang; Craig Michoski

TL;DR

The paper addresses EEG-based emotion classification under data- and parameter-constrained conditions. It introduces the Kernel Attention Module (KAM), a parameter-efficient self-attention mechanism that replaces the standard QKV inner product with a kernel matrix $M_K(x; \theta)$, so that features are refined as $x \leftarrow (I + M_K(x;\theta))x$. Evaluations with EEGNet as the backbone on the SEED dataset show that KAM raises mean within-subject accuracy across 15 subjects by more than 1% in the best case, and that the learned kernel weights and a scalar $\alpha$ make the trained model interpretable. The work shows that a single extra parameter can deliver both a performance gain and interpretability, and it leaves alternative kernels and training strategies open for exploration.

Abstract

In this work, a kernel attention module is presented for EEG-based emotion classification with neural networks. The proposed module implements self-attention through a kernel trick, requiring significantly fewer trainable parameters and computations than standard attention modules. The design also provides a scalar for quantitatively examining the amount of attention assigned during deep feature refinement, which helps in interpreting a trained model. Using EEGNet as the backbone model, extensive experiments are conducted on the SEED dataset to assess the module's performance on within-subject classification tasks against other state-of-the-art (SOTA) attention modules. Requiring only one extra parameter, the inserted module is shown to boost the base model's mean prediction accuracy across 15 subjects by more than 1% in the best case. A key component of the method is the interpretability of its solutions, which is examined with several different techniques and included throughout as part of the dependency analysis.
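
The residual update $x \leftarrow (I + M_K(x;\theta))x$ can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the Gaussian (RBF) kernel, the fixed bandwidth `gamma`, the placement of the scalar `alpha` as a multiplier on the kernel matrix, the function name `kernel_attention`, and the input shape are all assumptions made here for illustration only.

```python
import numpy as np


def kernel_attention(x, alpha, gamma=1.0):
    """Sketch of x <- (I + M_K) x with an assumed RBF kernel.

    x     : array of shape (channels, features)
    alpha : scalar weight on the attention term (assumed parameterization)
    gamma : RBF bandwidth, held fixed here (assumption, not from the paper)
    """
    # Pairwise squared Euclidean distances between the rows of x.
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off

    # Kernel matrix M_K, scaled by the single scalar alpha.
    m_k = alpha * np.exp(-gamma * sq_dists)

    # Residual refinement: (I + M_K) x = x + M_K x.
    return x + m_k @ x


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))  # hypothetical (channels, features) shape
refined = kernel_attention(x, alpha=0.1)
```

In this sketch, setting `alpha` to zero returns the input unchanged, which is one way a single scalar can indicate how much attention is applied during feature refinement.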

Paper Structure (6 sections, 1 equation, 7 figures, 1 table)

Introduction
Related Work
Kernel Attention Module
Experiments
Conclusion
Acknowledgement

Figures (7)

Figure 1: The basic self-attention mechanism.
Figure 2: EEGNet with KAM inserted. Some important hyperparameters, kernel shapes and tensor sizes are also shown.
Figure 3: Overall mean prediction performance across 15 subjects.
Figure 4: Kernel weights mapped onto scalp maps. The first row shows the normalized mean and the second row the normalized standard deviation across the 5-fold cross-validation (5-CV).
Figure 5: $A$: Distribution of the learned $\alpha$ values during 5-CV with EEGNet+KAM across the 15 subjects. $B$: Change in accuracy as $\alpha$ is varied while the other parameters of the selected model are frozen (the selected model is marked in red in the first column of $A$, i.e. subject S01).
...and 2 more figures
