An Efficient Self-Supervised Framework for Long-Sequence EEG Modeling
Jiazhen Hong, Geoffrey Mackellar, Soheila Ghane
TL;DR
EEGM2 tackles long-sequence EEG modeling by combining a U-shaped encoder–decoder with Mamba-2 structured state-space blocks, giving computational complexity that is linear, $O(T)$, in the sequence length $T$. A temporal–spectral reconstruction loss preserves both the temporal and the spectral dynamics of the EEG signal. The method supports flexible downstream evaluation (linear probing, non-linear probing, and fine-tuning) and includes a lightweight EEGM2-L variant for resource-constrained deployment. Across the TUAB and Emotiv datasets, EEGM2 achieves state-of-the-art performance on both short and long sequences, demonstrates strong cross-subject generalization, and shows transferability across domains, with favorable memory and speed profiles compared with Transformer baselines. This work provides a scalable, efficient backbone for EEG representation learning applicable to real-time, resource-limited brain-computer interface devices.
Abstract
Electroencephalogram (EEG) signals generally exhibit low signal-to-noise ratio (SNR) and high inter-subject variability, making generalization across subjects and domains challenging. Recent advances in deep learning, particularly self-supervised learning with Transformer-based architectures, have shown promise in EEG representation learning. However, their computational complexity is quadratic in sequence length, which increases memory usage and slows inference, making them inefficient for modeling long-range dependencies. Moreover, most existing approaches emphasize either explicit window segmentation of the temporal signal or spectral-only input embedding while neglecting raw temporal dynamics. In this paper, we propose EEGM2, a self-supervised framework that overcomes these limitations. EEGM2 adopts a U-shaped encoder-decoder architecture integrated with Mamba-2 to achieve linear computational complexity, thereby reducing memory usage and improving inference speed. Meanwhile, the selective information propagation mechanism of Mamba-2 enables the model to effectively capture and preserve long-range dependencies in raw EEG signals, where traditional RNN or CNN architectures often struggle. In addition, EEGM2 employs a self-supervised pre-training objective that reconstructs raw EEG using a combined L1 and spectral (Fourier-based) loss, enhancing generalization by jointly preserving temporal dynamics and spectral characteristics. Experimental results demonstrate that EEGM2 achieves state-of-the-art performance in both short- and long-sequence modeling and classification. Further evaluations show that EEGM2 consistently outperforms existing models, demonstrating strong generalization across subjects and tasks, as well as transferability across domains. Overall, EEGM2 offers an efficient and scalable solution suitable for deployment on resource-constrained brain-computer interface (BCI) devices.
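
The reconstruction objective named in the abstract, an L1 term combined with a Fourier-based spectral term, can be illustrated with a minimal NumPy sketch. The abstract does not give the exact form of the spectral term, so several details here are assumptions: the spectral term is taken to be the mean absolute difference between amplitude spectra along the time axis, and the function name `temporal_spectral_loss` and the weighting parameter `spectral_weight` are hypothetical, not taken from the paper.

```python
import numpy as np


def temporal_spectral_loss(x, x_hat, spectral_weight=1.0):
    """Combined L1 + spectral reconstruction loss (illustrative sketch).

    x, x_hat: arrays of shape (..., time), e.g. (channels, time).
    spectral_weight: hypothetical weight on the spectral term (assumption).
    """
    x = np.asarray(x, dtype=np.float64)
    x_hat = np.asarray(x_hat, dtype=np.float64)
    if x.shape != x_hat.shape:
        raise ValueError("x and x_hat must have the same shape")

    # Temporal term: mean absolute error on the raw signal.
    temporal = np.mean(np.abs(x - x_hat))

    # Spectral term (assumed form): mean absolute error between
    # amplitude spectra computed with a real FFT over the time axis.
    amp = np.abs(np.fft.rfft(x, axis=-1))
    amp_hat = np.abs(np.fft.rfft(x_hat, axis=-1))
    spectral = np.mean(np.abs(amp - amp_hat))

    return temporal + spectral_weight * spectral


# Usage: a toy single-channel "EEG" segment and a shifted reconstruction.
signal = np.sin(np.linspace(0.0, 2.0 * np.pi, 64))[None, :]
reconstruction = signal + 0.5
loss = temporal_spectral_loss(signal, reconstruction)
```

Under these assumptions, a perfect reconstruction gives a loss of zero, and setting `spectral_weight=0.0` reduces the objective to plain L1. In actual pre-training the same idea would be written in a differentiable framework so that gradients flow through the FFT.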
