CBraMod: A Criss-Cross Brain Foundation Model for EEG Decoding

Jiquan Wang, Sha Zhao, Zhiling Luo, Yangxuan Zhou, Haiteng Jiang, Shijian Li, Tao Li, Gang Pan

TL;DR

This work targets the limited generalizability of EEG decoding across diverse BCI tasks and datasets. It introduces CBraMod, an EEG foundation model built on a criss-cross transformer that separately models spatial and temporal dependencies via parallel attention, enhanced by an asymmetric conditional positional encoding. Pretrained with patch-based masked EEG reconstruction on the Temple University EEG Corpus (TUEG), CBraMod demonstrates state-of-the-art performance across 10 downstream BCI tasks over 12 public datasets, highlighting strong generalization. The results suggest that tailored spatial-temporal modeling and dynamic positional encoding can substantially improve EEG representation learning, moving toward universal EEG-BCI systems.

Abstract

Electroencephalography (EEG) is a non-invasive technique for measuring and recording the brain's electrical activity, and it is widely used in BCI and healthcare applications. Early EEG decoding methods rely on supervised learning, which ties them to specific tasks and datasets and limits both performance and generalizability. With the success of large language models, a growing body of work focuses on EEG foundation models. However, two challenges remain. First, most existing EEG foundation models employ a full EEG modeling strategy, which models the spatial and temporal dependencies between all EEG patches together and ignores that these dependencies are heterogeneous because of the unique structural characteristics of EEG signals. Second, existing EEG foundation models generalize poorly across a wide range of downstream BCI tasks, because the varying formats of EEG data make the models difficult to adapt. To address these challenges, we propose a novel foundation model called CBraMod. Specifically, we devise a criss-cross transformer as the backbone to fully exploit the structural characteristics of EEG signals; it models spatial and temporal dependencies separately through two parallel attention mechanisms. We also use an asymmetric conditional positional encoding scheme, which encodes the positional information of EEG patches and adapts easily to EEG data in diverse formats. CBraMod is pre-trained on a very large EEG corpus through patch-based masked EEG reconstruction. We evaluate CBraMod on up to 10 downstream BCI tasks (12 public datasets). CBraMod achieves state-of-the-art performance across this wide range of tasks, demonstrating its strong capability and generalizability. The source code is publicly available at https://github.com/wjq-learning/CBraMod.
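
To make the idea of two parallel attention mechanisms concrete, the following is a minimal sketch, not the authors' implementation. It assumes EEG patches are arranged as a grid `x[channel][patch]` of embedding vectors, that the embedding is split in half between the two branches, and that attention uses identity projections (no learned weights). The function names and the half-and-half split are illustrative assumptions; the abstract states only that spatial and temporal dependencies are modeled separately and in parallel.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention over a list of vectors.

    Assumption: Q = K = V = input (no learned projections), to keep
    the sketch short and dependency-free.
    """
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, seq)) for i in range(d)])
    return out


def criss_cross_attention(x):
    """x[c][n] is the embedding of channel c at time patch n.

    Spatial branch: for each time patch, attend across channels.
    Temporal branch: for each channel, attend across time patches.
    Each branch sees one half of the features (an assumption), and the
    two results are concatenated back to the original width.
    """
    n_ch, n_patch = len(x), len(x[0])
    half = len(x[0][0]) // 2

    spatial = [[None] * n_patch for _ in range(n_ch)]
    for n in range(n_patch):
        column = self_attention([x[c][n][:half] for c in range(n_ch)])
        for c in range(n_ch):
            spatial[c][n] = column[c]

    temporal = [
        self_attention([x[c][n][half:] for n in range(n_patch)])
        for c in range(n_ch)
    ]

    return [
        [spatial[c][n] + temporal[c][n] for n in range(n_patch)]
        for c in range(n_ch)
    ]


if __name__ == "__main__":
    # Toy input: 3 channels, 2 time patches, embedding width 4.
    eeg = [
        [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]],
        [[2.0, 0.0, 1.0, 1.0], [1.0, 1.0, 2.0, 2.0]],
        [[0.5, 0.5, 0.5, 0.5], [3.0, 1.0, 0.0, 2.0]],
    ]
    out = criss_cross_attention(eeg)
    print(len(out), len(out[0]), len(out[0][0]))
```

In this sketch each patch attends to only its own row and column of the grid rather than to every other patch, which is the contrast with the full EEG modeling strategy that the abstract draws.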

Paper Structure

This paper contains 44 sections, 7 equations, 13 figures, 23 tables.

Figures (13)

  • Figure 1: EEG patches and different EEG modeling strategies.
  • Figure 2: CBraMod pre-training overview.
  • Figure 3: Criss-Cross Transformer Block.
  • Figure 4: The results of attention mechanism comparison.
  • Figure 5: The results of positional encoding comparison.
  • ...and 8 more figures