miMamba: EEG-based Emotion Recognition with Multi-scale Inverted Mamba Models

Xin Zhou; Dawei Huang; Xiaojing Peng; Lijun Yin

TL;DR

This work tackles EEG-based emotion recognition by proposing MS-iMamba, a network built from two modules: Multi-Scale Temporal Blocks (MSTB), which extract multi-scale temporal features, and Temporal-Spatial Fusion Blocks (TSFB), which model the interaction between temporal and spatial dynamics with an inverted Mamba (iMamba) structure. By combining inverted embedding with a selective state space model (SSM), the approach captures spatiotemporal dependencies without hand-crafted time-frequency features. Using only four EEG channels, it reports state-of-the-art or near-top performance on DEAP, DREAMER, and SEED in both intra- and inter-subject settings, which the authors take as evidence of generalization and data efficiency. The findings highlight the value of representations that model temporal and spatial features jointly for EEG emotion decoding, and point to future work on cross-subject robustness and data-scarce scenarios.

Abstract

EEG-based emotion recognition holds significant potential in the field of brain-computer interfaces. A key challenge lies in extracting discriminative spatiotemporal features from electroencephalogram (EEG) signals. Existing studies often rely on domain-specific time-frequency features and analyze temporal dependencies and spatial characteristics separately, neglecting the interaction between local-global relationships and spatiotemporal dynamics. To address this, we propose a novel network called Multi-Scale Inverted Mamba (MS-iMamba), which consists of Multi-Scale Temporal Blocks (MSTB) and Temporal-Spatial Fusion Blocks (TSFB). Specifically, MSTBs are designed to capture both local details and global temporal dependencies across different scale subsequences. The TSFBs, implemented with an inverted Mamba structure, focus on the interaction between dynamic temporal dependencies and spatial characteristics. The primary advantage of MS-iMamba lies in its ability to leverage reconstructed multi-scale EEG sequences, exploiting the interaction between temporal and spatial features without the need for domain-specific time-frequency feature extraction. Experimental results on the DEAP, DREAMER, and SEED datasets demonstrate that MS-iMamba achieves classification accuracies of 94.86%, 94.94%, and 91.36%, respectively, using only four-channel EEG signals, outperforming state-of-the-art methods.
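The multi-scale step can be illustrated with a small sketch. The abstract does not say how the subsequences are formed, so the version below assumes a common approach: pick dominant periods from the FFT amplitude spectrum and fold the 1-D sequence into a 2-D array per period, in line with the Figure 1 description of frequency components reshaped into 2-D patches. The function names `top_k_periods` and `to_2d`, the synthetic signal, and all sizes are hypothetical and are not taken from the paper.

```python
import numpy as np

def top_k_periods(x, k):
    """x: (T, C) signal. Return the k dominant periods (in samples).

    Assumption: periods are chosen from the channel-averaged FFT amplitude.
    """
    T = x.shape[0]
    amp = np.abs(np.fft.rfft(x, axis=0)).mean(axis=1)  # (T // 2 + 1,)
    amp[0] = 0.0                                       # ignore the DC component
    order = np.argsort(amp)[::-1]                      # strongest frequency first
    freqs = [int(f) for f in order if f > 0][:k]
    return [T // f for f in freqs]

def to_2d(x, period):
    """Fold a (T, C) signal into (rows, period, C), zero-padding the tail."""
    T, C = x.shape
    rows = -(-T // period)                             # ceil(T / period)
    pad = rows * period - T
    x_pad = np.pad(x, ((0, pad), (0, 0)))
    return x_pad.reshape(rows, period, C)

# Synthetic 4-channel signal with a strong 16-sample and a weaker 32-sample rhythm.
T = 128
t = np.arange(T)
base = np.sin(2 * np.pi * t / 16) + 0.5 * np.sin(2 * np.pi * t / 32)
x = base[:, None] * np.array([1.0, 0.5, 2.0, 1.5])    # (T, 4)

periods = top_k_periods(x, k=2)
patches = [to_2d(x, p) for p in periods]
```

Each 2-D array places samples within one period along one axis (local detail) and successive periods along the other (longer-range structure), which is what lets a 2-D convolution see both at once.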

Paper Structure (24 sections, 12 equations, 4 figures, 7 tables, 1 algorithm)

Introduction
Related Work
Multi-Scale Representation Learning
Spatiotemporal Representation Learning
Method
Notations and Definitions
Multi-Scale Temporal Block (MSTB)
Multi-Scale Representation
Multi-Scale Perception
Temporal-Spatial Fusion Block (TSFB)
Inverted Embedding Representation
iMamba
Experiment and Results Analysis
Datasets
Training Protocol
...and 9 more sections

Figures (4)

Figure 1: Architecture of the MS-iMamba network for EEG emotion recognition. The network comprises two main modules: the Multi-Scale Temporal Block (MSTB) and the Temporal-Spatial Fusion Block (TSFB). The MSTB extracts multi-scale representations by converting the EEG signal into different frequency-domain components and reshaping them into 2-D patches, from which convolution operations capture both local and global dependencies. The TSFB then integrates temporal and spatial information by embedding multiple time steps of the same channel into tokens, which are processed by the iMamba module, a combination of the inverted embedding mechanism and a selective state space model (SSM).
Figure 2: Comparison between the normal and inverted embedding mechanisms. The top part illustrates the conventional embedding approach, where data from different channels at the same time step are mapped into a single token. The bottom part depicts the inverted embedding approach, where multiple time steps of the same channel are mapped into a single token.
Figure 3: Performance of different MS-iMamba variants on the DEAP and DREAMER datasets under intra-subject conditions.
Figure 4: Performance of different MS-iMamba variants on the SEED dataset across four session modes under intra-subject conditions.
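The difference between the two embedding schemes described in the Figure 2 caption comes down to which axis of the EEG array is treated as the token axis. The sketch below is a minimal illustration under assumed shapes (128 time steps, 4 channels, 16-dimensional embeddings) with a random linear projection standing in for a learned one; the helper names are hypothetical and this is not the authors' implementation.

```python
import numpy as np

def normal_tokens(x):
    """x: (T, C). One token per time step; each token holds all C channels."""
    return x                      # (T, C): T tokens of length C

def inverted_tokens(x):
    """x: (T, C). One token per channel; each token holds all T time steps."""
    return x.T                    # (C, T): C tokens of length T

def embed(tokens, d_model, rng):
    """Project each token to d_model dimensions (random stand-in weights)."""
    in_dim = tokens.shape[1]
    w = rng.standard_normal((in_dim, d_model)) / np.sqrt(in_dim)
    return tokens @ w

rng = np.random.default_rng(0)
T, C, D = 128, 4, 16              # assumed sizes, for illustration only
x = rng.standard_normal((T, C))

normal = embed(normal_tokens(x), D, rng)      # (128, 16): sequence over time
inverted = embed(inverted_tokens(x), D, rng)  # (4, 16): sequence over channels
```

With inverted embedding the downstream sequence model iterates over channels rather than time steps, so each token already summarizes one channel's temporal behavior and the model's job is to relate channels to one another.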
