SSTAF: Spatial-Spectral-Temporal Attention Fusion Transformer for Motor Imagery Classification

Ummay Maria Muna; Md. Mehedi Hasan Shawon; Md Jobayer; Sumaiya Akter; Saifur Rahman Sabuj

TL;DR

The paper tackles cross-subject EEG-based motor imagery classification, a task hindered by signal non-stationarity and inter-subject variability. It introduces the SSTAF Transformer, which fuses spectral, spatial, and temporal attention and operates on time-frequency representations computed with a custom multi-channel STFT. Key contributions include the spectral and spatial attention modules, a two-layer transformer encoder, and validation on EEGMMIDB and BCI IV-2a under LOSO and k-fold schemes, with accuracies of $76.83\%$ on EEGMMIDB and $68.30\%$ on BCI IV-2a, along with ablation and visualization analyses. The approach demonstrates improved cross-subject generalization and offers a framework for robust MI decoding with potential impact on neurorehabilitation and assistive technologies; dataset size remains a current limitation.
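To make the leave-one-subject-out (LOSO) scheme mentioned above concrete, here is a minimal sketch of how such splits can be generated. The function name `loso_splits` and the toy subject labels are hypothetical and are not taken from the paper; the sketch only illustrates the general protocol, in which every trial of one subject is held out for testing while the model trains on all remaining subjects.

```python
from collections import defaultdict


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    # Group trial indices by the subject they belong to.
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)

    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        # Training set: every trial from every other subject.
        train_idx = sorted(
            i
            for sid, idxs in by_subject.items()
            if sid != held_out
            for i in idxs
        )
        yield held_out, train_idx, test_idx


# Hypothetical toy data: six trials recorded from three subjects.
subjects = ["S1", "S1", "S2", "S2", "S3", "S3"]
folds = list(loso_splits(subjects))
```

Because no trial from the held-out subject ever appears in training, the accuracy averaged over folds estimates how well a model transfers to an unseen person, which is the cross-subject setting the paper targets.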

Abstract

Electroencephalography (EEG)-based motor imagery brain-computer interfaces (BCIs) offer promising solutions in neurorehabilitation and assistive technologies by enabling communication between the brain and external devices. However, the non-stationary nature of EEG signals and significant inter-subject variability pose substantial challenges for developing robust cross-subject classification models. This paper introduces a novel Spatial-Spectral-Temporal Attention Fusion (SSTAF) Transformer specifically designed for upper-limb motor imagery classification. Our architecture consists of a spectral transformer and a spatial transformer, followed by a transformer block and a classifier network. Each module integrates attention mechanisms that dynamically attend to the most discriminative patterns across multiple domains: spectral frequencies, spatial electrode locations, and temporal dynamics. The short-time Fourier transform is used to extract time-frequency features, which makes the classes easier for the model to distinguish. We evaluated the SSTAF Transformer on two publicly available datasets, EEGMMIDB and BCI Competition IV-2a. It achieves accuracies of 76.83% and 68.30% on these datasets, respectively, outperforming traditional CNN-based architectures and a few existing transformer-based approaches.
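As an illustration of the time-frequency step described in the abstract, the following is a minimal sketch of a per-channel short-time Fourier transform using NumPy. It is not the authors' implementation: the function name `multichannel_stft`, the Hann window, the window length, the hop size, the 128 Hz sampling rate, and the synthetic sine-wave input are all assumptions chosen for readability.

```python
import numpy as np


def multichannel_stft(eeg, n_fft=64, hop=32):
    """Magnitude STFT of a (channels, samples) array.

    Returns an array of shape (channels, freq_bins, frames).
    """
    n_channels, n_samples = eeg.shape
    window = np.hanning(n_fft)  # assumed window; the paper's choice is not stated here
    n_frames = 1 + (n_samples - n_fft) // hop

    # Cut overlapping frames for all channels: (channels, frames, n_fft).
    frames = np.stack(
        [eeg[:, i * hop : i * hop + n_fft] for i in range(n_frames)], axis=1
    )
    # Real FFT along the frame axis: (channels, frames, n_fft // 2 + 1).
    spectrum = np.fft.rfft(frames * window, axis=-1)
    # Reorder to (channels, freq_bins, frames).
    return np.abs(spectrum).transpose(0, 2, 1)


# Hypothetical toy input: 3 channels, 2 s at an assumed 128 Hz sampling rate.
fs = 128
t = np.arange(2 * fs) / fs
eeg = np.stack(
    [
        np.sin(2 * np.pi * 10 * t),  # 10 Hz (alpha/mu-range) oscillation
        np.sin(2 * np.pi * 20 * t),  # 20 Hz (beta-range) oscillation
        np.zeros_like(t),            # silent channel
    ]
)
tf_maps = multichannel_stft(eeg)
freqs = np.fft.rfftfreq(64, d=1 / fs)  # centre frequency of each bin, in Hz
```

The resulting array keeps one time-frequency map per electrode, which is the kind of input over which spectral attention (across frequency bins) and spatial attention (across channels) can then operate.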
