CEReBrO: Compact Encoder for Representations of Brain Oscillations Using Efficient Alternating Attention

Alexandru Dimofte, Glenn Anta Bucagu, Thorir Mar Ingolfsson, Xiaying Wang, Andrea Cossettini, Luca Benini, Yawei Li

TL;DR

CEReBrO is a compact encoder for efficient, scalable EEG representation learning. It tokenizes EEG at a per-channel patch level and uses an alternating attention mechanism to jointly model intra-channel temporal dynamics and inter-channel spatial correlations. Compared to standard self-attention, the authors report a 2x speed improvement with 6x less memory. Pre-trained on a large public corpus (>20,000 hours), the models set new benchmarks in emotion and seizure detection and are competitive in anomaly classification and gait prediction. Key contributions are the tokenization scheme, the alternating attention design, and padding-based handling of varying channel counts, which supports cross-device applicability and reproducibility on public data. The efficiency gains are relevant to real-time EEG analysis on resource-constrained devices and lay groundwork for smaller EEG foundation models with strong downstream transfer.

Abstract

Electroencephalography (EEG) is a crucial tool for studying brain activity. Recently, self-supervised learning methods leveraging large unlabeled datasets have emerged as a potential solution to the scarcity of widely available annotated EEG data. However, current methods suffer from at least one of the following limitations: i) sub-optimal EEG signal modeling, ii) model sizes in the hundreds of millions of trainable parameters, and iii) reliance on private datasets and/or inconsistent public benchmarks, hindering reproducibility. To address these challenges, we introduce a Compact Encoder for Representations of Brain Oscillations using alternating attention (CEReBrO), a new small EEG foundation model. Our tokenization scheme represents EEG signals at a per-channel patch granularity. We propose an alternating attention mechanism that jointly models intra-channel temporal dynamics and inter-channel spatial correlations, achieving a 2x speed improvement while requiring 6x less memory than standard self-attention. We present several model sizes ranging from 3.6 million to 85 million parameters. Pre-trained on over 20,000 hours of publicly available scalp EEG recordings with diverse channel configurations, our models set new benchmarks in emotion detection and seizure detection tasks, with competitive performance in anomaly classification and gait prediction. These results validate our models' effectiveness and efficiency.
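
The per-channel patch tokenization and alternating attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`patchify`, `attention`, `alternating_block`), the patch length, the identity query/key/value projections, the single attention head, and the choice of even layers for intra-channel attention are all assumptions made for illustration. Learned embeddings, positional encodings, and channel padding are omitted.

```python
import numpy as np

def patchify(eeg, patch_len):
    """Split a (C, T) recording into per-channel patches of shape (C, N_p, patch_len)."""
    num_channels, num_samples = eeg.shape
    n_p = num_samples // patch_len  # trailing samples that do not fill a patch are dropped
    return eeg[:, : n_p * patch_len].reshape(num_channels, n_p, patch_len)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(x):
    """Single-head self-attention over the second-to-last axis.

    Assumption: identity Q/K/V projections, to keep the sketch short.
    """
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)
    return softmax(scores) @ x

def alternating_block(tokens, layer_idx):
    """tokens has shape (C, N_p, D).

    Assumed schedule: even layers attend within each channel (over its N_p
    patches); odd layers attend across channels at each patch index.
    """
    if layer_idx % 2 == 0:
        return attention(tokens)             # batch over C, sequence over N_p
    swapped = np.swapaxes(tokens, 0, 1)      # (N_p, C, D): batch over N_p, sequence over C
    return np.swapaxes(attention(swapped), 0, 1)

# Hypothetical input: 4 channels, 1000 samples, patches of 50 samples.
rng = np.random.default_rng(0)
eeg = rng.standard_normal((4, 1000))
tokens = patchify(eeg, patch_len=50)         # shape (4, 20, 50)
out = tokens
for layer_idx in range(2):
    out = alternating_block(out, layer_idx)
```

In this sketch neither layer type ever forms an attention matrix over all C x N_p tokens at once, which is the intuition behind the memory savings the abstract reports.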

Paper Structure

This paper contains 31 sections, 6 equations, 6 figures, 15 tables, and 3 algorithms.

Figures (6)

  • Figure 1: Comparison of CEReBrO with current SOTA in a) anomaly classification, b) seizure detection and c) emotion classification.
  • Figure 2: (a) Overview of the CEReBrO architecture. (b) Overview of our pre-training framework.
  • Figure 3: Forward pass GPU Memory Usage (a) and Runtime (b) vs. Sequence Length for alternating attention and standard self-attention in three CEReBrO model sizes. We use $N_p = 20$ and $C \in [1, 64]$ to simulate typical configurations.
  • Figure 4: Spectrogram square patching (left) and time bin patching (right).
  • Figure 5: Channel configurations per .edf file in the TUEG dataset.
  • ...and 1 more figure
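
To make the sequence-length setting in the Figure 3 caption concrete, the sketch below counts attention-score entries per layer for standard self-attention and for the two alternating layer types. It assumes that $N_p$ is the number of patches per channel, that the full sequence length is $C \cdot N_p$, and that alternating layers attend either within a channel or across channels at one patch index; the function name `score_entries` is hypothetical. These counts are only a rough proxy and are not the measured memory or runtime figures from the paper.

```python
def score_entries(num_channels, patches_per_channel):
    """Count attention-score entries for one layer under each scheme."""
    c, n_p = num_channels, patches_per_channel
    return {
        "full": (c * n_p) ** 2,   # one (C*N_p) x (C*N_p) score matrix
        "intra": c * n_p ** 2,    # C matrices of size N_p x N_p
        "inter": n_p * c ** 2,    # N_p matrices of size C x C
    }

# N_p = 20 as in the Figure 3 caption; C sampled from the range [1, 64].
for c in (1, 8, 64):
    print(c, score_entries(c, 20))
```

Under these assumptions, an intra-channel layer needs C times fewer score entries than full self-attention and an inter-channel layer needs $N_p$ times fewer, so the gap widens as the channel count grows.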