miMamba: EEG-based Emotion Recognition with Multi-scale Inverted Mamba Models
Xin Zhou, Dawei Huang, Xiaojing Peng, Lijun Yin
TL;DR
This work tackles EEG-based emotion recognition by proposing MS-iMamba, a network built from two modules: Multi-Scale Temporal Blocks (MSTB), which extract multi-scale temporal features, and Temporal-Spatial Fusion Blocks (TSFB), which use an inverted Mamba (iMamba) structure to model temporal-spatial interactions. By combining inverted embedding with a selective state space model, the approach captures spatiotemporal dependencies without hand-crafted time-frequency features. On DEAP, DREAMER, and SEED, using only four EEG channels, it reports state-of-the-art or near-top performance in both intra- and inter-subject settings, which suggests good generalization and data efficiency. The findings highlight the value of integrated, interaction-focused representations for EEG emotion decoding and point to future work on cross-subject robustness and data-scarce scenarios.
Abstract
EEG-based emotion recognition holds significant potential in the field of brain-computer interfaces. A key challenge lies in extracting discriminative spatiotemporal features from electroencephalogram (EEG) signals. Existing studies often rely on domain-specific time-frequency features and analyze temporal dependencies and spatial characteristics separately, neglecting both the interplay of local and global relationships and the interaction between temporal and spatial dynamics. To address this, we propose a novel network called Multi-Scale Inverted Mamba (MS-iMamba), which consists of Multi-Scale Temporal Blocks (MSTBs) and Temporal-Spatial Fusion Blocks (TSFBs). Specifically, MSTBs are designed to capture both local details and global temporal dependencies across subsequences at different scales. The TSFBs, implemented with an inverted Mamba structure, focus on the interaction between dynamic temporal dependencies and spatial characteristics. The primary advantage of MS-iMamba lies in its ability to leverage reconstructed multi-scale EEG sequences, exploiting the interaction between temporal and spatial features without the need for domain-specific time-frequency feature extraction. Experimental results on the DEAP, DREAMER, and SEED datasets demonstrate that MS-iMamba achieves classification accuracies of 94.86%, 94.94%, and 91.36%, respectively, using only four-channel EEG signals, outperforming state-of-the-art methods.
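The two ideas the abstract names, multi-scale subsequences and inverted embedding, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the multi-scale sequences are reconstructed, so non-overlapping average pooling is an assumption here, and "inverted embedding" is read in the usual sense of mapping each channel's whole time series to a single token. The function names (`average_pool`, `multi_scale_views`, `inverted_embed`), the toy dimensions, and the random weights are all hypothetical.

```python
import random


def average_pool(series, scale):
    """Downsample one channel by averaging non-overlapping windows of `scale` samples.

    Assumption: the source does not specify the pooling scheme.
    """
    usable = len(series) - len(series) % scale  # drop any ragged tail
    return [sum(series[i:i + scale]) / scale for i in range(0, usable, scale)]


def multi_scale_views(eeg, scales):
    """Build one downsampled copy of the recording per scale.

    eeg: list of channels, each a list of samples.
    Returns a dict mapping scale -> pooled recording.
    """
    return {s: [average_pool(channel, s) for channel in eeg] for s in scales}


def inverted_embed(eeg, weights):
    """Inverted embedding: one token per channel, not one token per time step.

    weights: matrix with one row per time sample and d_model columns.
    Returns a list of n_channels tokens, each of length d_model.
    """
    d_model = len(weights[0])
    return [
        [sum(x_t * w_row[j] for x_t, w_row in zip(channel, weights))
         for j in range(d_model)]
        for channel in eeg
    ]


# Toy four-channel recording with 8 samples per channel (hypothetical sizes).
rng = random.Random(0)
n_channels, n_samples, d_model = 4, 8, 3
eeg = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(n_channels)]

views = multi_scale_views(eeg, scales=(1, 2, 4))
weights = [[rng.gauss(0.0, 1.0) for _ in range(d_model)] for _ in range(n_samples)]
tokens = inverted_embed(eeg, weights)
```

With tokens indexed by channel, a sequence model run over them mixes information across electrodes, while each token already summarizes that channel's temporal content. That is one plausible reading of how the TSFB couples temporal and spatial features.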
