Neuro-GPT: Towards A Foundation Model for EEG

Wenhui Cui, Woojae Jeong, Philipp Thölke, Takfarinas Medani, Karim Jerbi, Anand A. Joshi, Richard M. Leahy

TL;DR

This work tackles data scarcity and variability in EEG-based BCI tasks by introducing Neuro-GPT, a foundation model that combines an EEG encoder with a GPT decoder. The model is pre-trained on the large TUH EEG dataset with a self-supervised objective that reconstructs masked EEG chunks, and is then fine-tuned on a low-data motor imagery classification task with 9 subjects. Pre-training substantially improves cross-subject classification accuracy over training from scratch. Encoder-only fine-tuning delivers the strongest gains, which suggests that the EEG encoder learns transferable, task-relevant features. The study highlights the potential of EEG foundation models to address data scarcity and heterogeneity, and provides insights into effective pre-training and fine-tuning configurations.

Abstract

To handle the scarcity and heterogeneity of electroencephalography (EEG) data for Brain-Computer Interface (BCI) tasks, and to harness the power of large publicly available data sets, we propose Neuro-GPT, a foundation model consisting of an EEG encoder and a GPT model. The foundation model is pre-trained on a large-scale data set using a self-supervised task that learns how to reconstruct masked EEG segments. We then fine-tune the model on a Motor Imagery Classification task to validate its performance in a low-data regime (9 subjects). Our experiments demonstrate that applying a foundation model can significantly improve classification performance compared to a model trained from scratch, which provides evidence for the generalizability of the foundation model and its ability to address challenges of data scarcity and heterogeneity in EEG. The code is publicly available at github.com/wenhui0206/NeuroGPT.

Paper Structure

This paper contains 8 sections, 2 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Neuro-GPT Pipeline: the EEG encoder takes chunks of EEG data as input and generates embeddings as tokens for the GPT model. The last embedded chunk in the sequence is masked. The GPT model then predicts the masked chunk and a reconstruction loss is computed between the prediction and the original embedding token.
  • Figure 2: Causal masking: consider a sequence with four tokens (chunks). We duplicate the sequence three times and progressively mask (represented in orange) one token within each duplicated sequence.
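The masking scheme in the two captions above can be illustrated with a minimal sketch. It uses plain Python lists in place of embedding tensors and is not taken from the authors' code. Several details are assumptions that the captions do not state: the i-th copy masks the token at position i + 1, tokens after the masked one are hidden from the model (shown as `None`), the mask placeholder is the hypothetical string `"<M>"`, and the reconstruction loss is a mean squared error. The function names `causal_mask_copies` and `mse` are illustrative.

```python
from typing import List, Optional, Sequence, Tuple

MASK = "<M>"  # hypothetical placeholder for the masked token


def causal_mask_copies(
    tokens: Sequence[str],
) -> List[Tuple[List[Optional[str]], int]]:
    """Duplicate a token sequence len(tokens) - 1 times, masking one token per copy.

    Returns (masked_copy, target_index) pairs. In each copy the tokens before
    the target stay visible, the target is replaced by MASK, and the tokens
    after it are hidden (None). The handling of later tokens is an assumption.
    """
    copies = []
    for target in range(1, len(tokens)):
        visible = list(tokens[:target])
        hidden_tail: List[Optional[str]] = [None] * (len(tokens) - target - 1)
        copies.append((visible + [MASK] + hidden_tail, target))
    return copies


def mse(prediction: Sequence[float], original: Sequence[float]) -> float:
    """Mean squared error between a predicted and an original embedding.

    The captions only say "reconstruction loss"; MSE is an assumed choice.
    """
    return sum((p - o) ** 2 for p, o in zip(prediction, original)) / len(original)


if __name__ == "__main__":
    # Four chunks, as in Figure 2, give three masked copies.
    for masked_copy, target in causal_mask_copies(["c1", "c2", "c3", "c4"]):
        print(target, masked_copy)
```

Under these assumptions the final copy masks the last chunk of the sequence, which matches the case drawn in Figure 1. In a real model each string would be an embedding vector produced by the EEG encoder, and `mse` would be evaluated between the GPT output at the masked position and the original embedding token.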