Large Brain Model for Learning Generic Representations with Tremendous EEG Data in BCI

Wei-Bang Jiang, Li-Ming Zhao, Bao-Liang Lu

TL;DR

This work tackles EEG data scarcity and heterogeneity by introducing LaBraM, a large-scale, self-supervised EEG foundation model. It employs patch-based EEG processing, a vector-quantized neural spectrum tokenizer, and masked EEG modeling to learn universal representations across diverse datasets and channel configurations. Evaluations on TUAB and TUEV show LaBraM outperforms state-of-the-art methods on abnormal detection and event-type classification, with gains increasing as model size grows. The study demonstrates the feasibility of scalable, cross-task EEG representations and discusses data requirements, design choices, and future directions like multi-modal integration and efficient fine-tuning.

Abstract

The current electroencephalogram (EEG) based deep learning models are typically designed for specific datasets and applications in brain-computer interaction (BCI), limiting the scale of the models and thus diminishing their perceptual capabilities and generalizability. Recently, Large Language Models (LLMs) have achieved unprecedented success in text processing, prompting us to explore the capabilities of Large EEG Models (LEMs). We hope that LEMs can break through the limitations of different task types of EEG datasets, and obtain universal perceptual capabilities of EEG signals through unsupervised pre-training. Then the models can be fine-tuned for different downstream tasks. However, compared to text data, the volume of EEG datasets is generally small and the format varies widely. For example, there can be mismatched numbers of electrodes, unequal length data samples, varied task designs, and low signal-to-noise ratio. To overcome these challenges, we propose a unified foundation model for EEG called Large Brain Model (LaBraM). LaBraM enables cross-dataset learning by segmenting the EEG signals into EEG channel patches. Vector-quantized neural spectrum prediction is used to train a semantically rich neural tokenizer that encodes continuous raw EEG channel patches into compact neural codes. We then pre-train neural Transformers by predicting the original neural codes for the masked EEG channel patches. The LaBraMs were pre-trained on about 2,500 hours of various types of EEG signals from around 20 datasets and validated on multiple different types of downstream tasks. Experiments on abnormal detection, event type classification, emotion recognition, and gait prediction show that our LaBraM outperforms all compared SOTA methods in their respective fields. Our code is available at https://github.com/935963004/LaBraM.
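
The two preprocessing ideas named in the abstract, cutting each channel into fixed-length patches and mapping continuous patch features to discrete codes, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the patch length, the codebook size, and the use of plain Euclidean distance for the codebook lookup are all assumptions made for illustration, and the random "embeddings" stand in for the output of a learned encoder.

```python
import numpy as np


def segment_into_channel_patches(eeg, patch_len):
    """Split a (channels, samples) array into (channels, n_patches, patch_len).

    Samples that do not fill a complete patch at the end are dropped.
    """
    n_channels, n_samples = eeg.shape
    n_patches = n_samples // patch_len
    trimmed = eeg[:, : n_patches * patch_len]
    return trimmed.reshape(n_channels, n_patches, patch_len)


def nearest_code_indices(embeddings, codebook):
    """Return, for each row of embeddings (N, D), the index of the closest
    codebook row (K, D) under squared Euclidean distance (an assumption)."""
    diffs = embeddings[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    return np.argmin(dists, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical recording: 4 channels, 1000 samples, patches of 200 samples.
    eeg = rng.standard_normal((4, 1000))
    patches = segment_into_channel_patches(eeg, patch_len=200)
    print(patches.shape)  # (4, 5, 200)

    # Stand-in for encoder outputs: one 16-dim vector per channel patch.
    embeddings = rng.standard_normal((4 * 5, 16))
    codebook = rng.standard_normal((32, 16))  # hypothetical 32-entry codebook
    codes = nearest_code_indices(embeddings, codebook)
    print(codes.shape)  # (20,)
```

Because every recording is reduced to a sequence of per-channel patches, datasets with different electrode counts or durations simply yield sequences of different lengths, which is what makes cross-dataset training possible in this kind of design.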

Paper Structure (26 sections, 13 equations, 8 figures, 11 tables)

Figures (8)

  • Figure 1: The overall architecture of LaBraM, i.e., the neural Transformer. All input EEG signals are first segmented into EEG patches through a fixed-length time window, and a temporal encoder is then applied to each patch to extract temporal features. Afterward, temporal and spatial embeddings are added to the patch features to carry temporal and spatial information. Finally, the sequence of embeddings is passed into the Transformer encoder, which applies patch-wise attention to obtain the final output.
  • Figure 2: Overview of neural tokenizer training and LaBraM pre-training. Top: We train a neural tokenizer to discretize EEG signals into discrete neural tokens by reconstructing the Fourier spectrum. Bottom: During pre-training, a portion of the EEG patches is masked, and the objective is to predict the masked tokens from the visible patches.
  • Figure 3: The pre-training loss curve and masked EEG modeling accuracy curve.
  • Figure 4: A comparison of the model's performance on the TUAB and TUEV datasets with and without including those datasets in the pre-training data.
  • Figure 5: A comparison of the performance of the Base model, Large model, and Huge model on the TUAB and TUEV datasets as the pre-training data increases.
  • ...and 3 more figures
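
The two training stages summarized in the Figure 2 caption can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, splitting the spectrum into amplitude and phase is an assumed choice of reconstruction target, `numpy.fft.rfft` is used for convenience, and the mask ratio is an arbitrary example value rather than the paper's setting.

```python
import numpy as np


def spectrum_targets(patches):
    """Compute per-patch Fourier amplitude and phase along the last axis.

    These could serve as reconstruction targets for a tokenizer; treating
    amplitude and phase separately is an assumption of this sketch.
    """
    spectrum = np.fft.rfft(patches, axis=-1)
    return np.abs(spectrum), np.angle(spectrum)


def random_patch_mask(n_patches, mask_ratio, rng):
    """Return a boolean array with round(n_patches * mask_ratio) True entries,
    marking the patches that would be hidden during masked EEG modeling."""
    n_masked = int(round(n_patches * mask_ratio))
    mask = np.zeros(n_patches, dtype=bool)
    masked_idx = rng.choice(n_patches, size=n_masked, replace=False)
    mask[masked_idx] = True
    return mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical input: 20 patches of 200 samples each.
    patches = rng.standard_normal((20, 200))
    amplitude, phase = spectrum_targets(patches)
    print(amplitude.shape, phase.shape)  # (20, 101) (20, 101)

    mask = random_patch_mask(n_patches=20, mask_ratio=0.5, rng=rng)
    visible = patches[~mask]  # what the model would see
    print(int(mask.sum()), visible.shape)  # 10 (10, 200)
```

In a full pipeline, the model would receive only the visible patches (plus mask placeholders) and be trained to predict the tokenizer's code index for each masked position.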