An Efficient Self-Supervised Framework for Long-Sequence EEG Modeling

Jiazhen Hong, Geoffrey Mackellar, Soheila Ghane

TL;DR

EEGM2 tackles long-sequence EEG modeling by marrying a U-shaped encoder–decoder with Mamba-2 structured state-space blocks to achieve linear $O(T)$ computational complexity while preserving temporal and spectral EEG dynamics through a temporal–spectral reconstruction loss. The method supports flexible downstream evaluation (linear/non-linear probing and fine-tuning) and includes a lightweight EEGM2-L variant for resource-constrained deployment. Across TUAB and Emotiv datasets, EEGM2 achieves state-of-the-art performance for short and long sequences, demonstrates strong cross-subject generalization, and shows transferability across domains, with favorable memory and speed profiles compared with Transformer baselines. This work provides a scalable, efficient backbone for EEG representation learning applicable to real-time, resource-limited brain-computer interface devices.
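To make the linear-cost claim concrete, the sketch below runs a generic time-varying diagonal state-space recurrence in Python with NumPy. It is a simplified illustration, not the Mamba-2 kernel used by EEGM2: the function name `selective_scan`, the per-step parameters `a`, `b`, `c`, and the state size `N` are all assumptions introduced here. The point is only that the loop visits each of the `T` time steps once and does `O(N)` work per step, whereas self-attention forms a `T × T` score matrix.

```python
import numpy as np


def selective_scan(x, a, b, c):
    """Sequential scan of a diagonal state-space recurrence (illustrative only).

    h_t = a_t * h_{t-1} + b_t * x_t
    y_t = <c_t, h_t>

    x: input sequence, shape (T,)
    a, b, c: hypothetical per-step parameters, each of shape (T, N)
    Returns y with shape (T,). Cost is O(T * N): one state update per step.
    """
    T, N = a.shape
    h = np.zeros(N)          # hidden state carried across time
    y = np.empty(T)
    for t in range(T):
        h = a[t] * h + b[t] * x[t]   # element-wise (diagonal) state update
        y[t] = c[t] @ h              # read-out for this time step
    return y


# With a = b = c = 1 and a single state, the recurrence reduces to a running sum.
x = np.array([1.0, 2.0, 3.0])
ones = np.ones((3, 1))
print(selective_scan(x, ones, ones, ones))
```

Doubling `T` doubles the number of loop iterations here, while the number of pairwise attention scores would quadruple.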

Abstract

Electroencephalogram (EEG) signals generally exhibit low signal-to-noise ratio (SNR) and high inter-subject variability, making generalization across subjects and domains challenging. Recent advances in deep learning, particularly self-supervised learning with Transformer-based architectures, have shown promise in EEG representation learning. However, their quadratic computational complexity increases memory usage and slows inference, making them inefficient for modeling long-range dependencies. Moreover, most existing approaches emphasize either explicit window segmentation of the temporal signal or spectral-only input embedding while neglecting raw temporal dynamics. In this paper, we propose EEGM2, a self-supervised framework that overcomes these limitations. EEGM2 adopts a U-shaped encoder-decoder architecture integrated with Mamba-2 to achieve linear computational complexity, thereby reducing memory usage and improving inference speed. Meanwhile, the selective information propagation mechanism of Mamba-2 enables the model to effectively capture and preserve long-range dependencies in raw EEG signals, where traditional RNN or CNN architectures often struggle. Moreover, EEGM2 employs a self-supervised pre-training objective that reconstructs raw EEG using a combined L1 and spectral (Fourier-based) loss, enhancing generalization by jointly preserving temporal dynamics and spectral characteristics. Experimental results demonstrate that EEGM2 achieves state-of-the-art performance in both short- and long-sequence modeling and classification. Further evaluations show that EEGM2 consistently outperforms existing models, demonstrating strong generalization across subjects and tasks, as well as transferability across domains. Overall, EEGM2 offers an efficient and scalable solution suitable for deployment on resource-constrained brain-computer interface (BCI) devices.
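As a rough illustration of the combined L1 and Fourier-based reconstruction objective, the following Python sketch adds a mean absolute error on the raw signal to a mean absolute error on the magnitude spectrum. The abstract does not give the exact formulation, so the function name `temporal_spectral_loss`, the weight `alpha`, the use of magnitude spectra, and the `(channels, time)` layout are assumptions made for this example.

```python
import numpy as np


def temporal_spectral_loss(x, x_hat, alpha=0.5):
    """Hypothetical temporal + spectral reconstruction loss.

    x, x_hat: target and reconstructed EEG, shape (channels, time).
    alpha: assumed weight on the spectral term (not taken from the paper).
    Returns (total, temporal_l1, spectral_l1).
    """
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)

    # Temporal term: mean absolute error on the raw samples.
    temporal = np.mean(np.abs(x - x_hat))

    # Spectral term: mean absolute error between magnitude spectra,
    # computed with a real FFT along the time axis.
    mag = np.abs(np.fft.rfft(x, axis=-1))
    mag_hat = np.abs(np.fft.rfft(x_hat, axis=-1))
    spectral = np.mean(np.abs(mag - mag_hat))

    return temporal + alpha * spectral, temporal, spectral


# Example: a 10 Hz sine sampled at 128 Hz for 2 s, reconstructed with a DC offset.
t = np.arange(256) / 128.0
target = np.sin(2 * np.pi * 10 * t)[None, :]
total, temporal, spectral = temporal_spectral_loss(target, target + 0.1)
print(total, temporal, spectral)
```

In this example the offset shows up in the temporal term as a uniform error and in the spectral term as a mismatch in the zero-frequency bin, which is the kind of complementary signal a joint objective is meant to capture.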

Paper Structure

This paper contains 23 sections, 4 equations, 4 figures, 7 tables.

Figures (4)

  • Figure 1: Overview of the EEGM2 framework. (a) Reconstruction-based self-supervised pretraining, where the model learns to reconstruct raw EEG signals using a multi-scale encoder–mediator–decoder architecture supervised by a temporal–spectral loss. No labels are required. (b) Downstream evaluation strategies: linear probing and non-linear probing extract frozen encoder representations (purple star), followed by a logistic regression or MLP classifier, respectively; fine-tuning jointly updates the entire model starting from the red star. The “eye open/close” example represents the binary class labels from the Crowdsourced EEG dataset (Section \ref{sec:data}).
  • Figure 2: Comparison of 2D t-SNE projections of (a) raw mean EEG features and (b) EEGM2-learned representations on the Crowdsourced EEG dataset.
  • Figure 3: Memory usage and inference speed across varying sequence lengths.
  • Figure 4: Training time of EEGM2 and its variants across four different settings.