EEG Foundation Models: A Critical Review of Current Progress and Future Directions

Gayal Kuruppu, Neeraj Wagh, Vaclav Kremen, Sandipan Pati, Gregory Worrell, Yogatheesan Varatharajah

TL;DR

This topical review critically assesses ten early EEG foundation models (EEG-FMs) to illuminate design choices in input representations, self-supervised pretraining, and evaluation strategies. It finds a common reliance on sequence-based transformers with masked reconstruction, but highlights substantial heterogeneity in evaluation and limited evidence for scalable gains. The authors argue for standardized benchmarks, broader data diversity, trustworthy SSL practices, and closer collaboration with domain experts to advance translational EEG-FM research. Collectively, they advocate a concerted effort in benchmarks, software tooling, and practical evaluations to accelerate real-world adoption in research, BCI, and clinical decision support.

Abstract

Premise. Patterns of electrical brain activity recorded via electroencephalography (EEG) offer immense value for scientific and clinical investigations. The inability of supervised EEG encoders to learn robust EEG patterns and their over-reliance on expensive signal annotations have sparked a transition towards general-purpose self-supervised EEG encoders, i.e., EEG foundation models (EEG-FMs), for robust and scalable EEG feature extraction. However, the real-world readiness of early EEG-FMs and the rubrics for long-term research progress remain unclear. Objective. In this work, we conduct a review of ten early EEG-FMs to capture common trends and identify key directions for future development of EEG-FMs. Methods. We comparatively analyze each EEG-FM using three fundamental pillars of foundation modeling, namely the representation of input data, self-supervised modeling, and the evaluation strategy. Based on this analysis, we present a critical synthesis of EEG-FM methodology, empirical findings, and outstanding research gaps. Results. We find that most EEG-FMs adopt a sequence-based modeling scheme that relies on transformer-based backbones and the reconstruction of masked temporal EEG sequences for self-supervision. However, model evaluations remain heterogeneous and largely limited, making it challenging to assess their practical off-the-shelf utility. In addition to adopting standardized and realistic evaluations, future work should demonstrate more substantial scaling effects and make principled and trustworthy choices throughout the EEG representation learning pipeline. Significance. Our review indicates that the development of benchmarks, software tools, technical methodologies, and applications in collaboration with domain experts may advance the translational utility and real-world adoption of EEG-FMs.

Paper Structure

This paper contains 24 sections, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Electroencephalography foundation model (EEG-FM) publication trends. The total annual Google Scholar search results for the term "EEG Foundation Model" between 2021 and September 2024 are shown in blue. The specific EEG-FMs reviewed in this study (i.e., those matching the search criteria described in the methods section) are listed in red, in chronological order by their preprint publication dates.
  • Figure 2: Comparative analysis of EEG foundation models (EEG-FMs). Our review analyzes EEG-FMs along three major dimensions: input data configuration, modeling, and evaluation (top figure). A summary of the various approaches undertaken by the EEG-FMs to address those components is shown in the bottom figure. EEG data is represented in one of three forms: raw time series, magnitude power spectrum, or time-frequency representation. Model architecture may include convolutional blocks to learn low-level patterns and/or transformer blocks to learn higher-level relationships. Models are pretrained primarily using self-supervised learning (SSL) approaches; the common SSL approaches used are masked reconstruction, auto-regressive modeling, and contrastive learning. The pretrained models are then evaluated on various downstream tasks, including clinical and non-clinical tasks.
  • Figure 3: Performances on common tasks. Here we compare EEG foundation models based on their performance on the common Temple University EEG Corpus tasks -- TUAB (abnormal EEG classification) and TUEV (event classification) -- along with the size of the dataset used for pretraining. The performance of FoME in the TUEV panel is shown as a line because the pretraining data size was unavailable. Channel-hours are on a $10^{3}$ scale. All scores represent fine-tuned model performance, except for the triangular markers, which represent linear-probed model performance.
  • Figure 4: The impact of learning paradigm and model scaling on task performance. In the learning-paradigm panel, we compare the impact of learning paradigms -- feature-based statistical machine learning (ML), supervised deep learning, and self-supervised pretraining (proposed EEG-FMs and other baselines) -- on task performance. For each model, a specific task is represented using a unique color, and finetuning and linear-probing evaluations are represented using $\bigcirc$ and $\triangle$, respectively. Note that downstream tasks and the metrics differ across models. In the model-scaling panel, we analyze the impact of model scaling on task performance. Although model sizes are specific to each study, we use 'Sm.', 'Md.', and 'Lg.' to represent the smallest, intermediate, and largest variants, respectively. Within each model, a specific task is represented using a unique color.
  • Figure 5: Suggested future directions. (I) Benchmarks and tools: future EEG foundation models can be compared using standardized benchmarks against prevailing feature paradigms and participate in community-specific EEG challenges to establish their real-world utility. Frictionless and user-friendly software tools are needed to quickly adopt and experiment with off-the-shelf models. (II) Technical modeling: holistic evaluation frameworks that test embedding space semantics, robustness, and transfer efficiency can meaningfully track the state of the art. Advanced representation learning techniques, such as federated or multi-modal learning, can enhance large-scale pretraining. (III) Applications: collaborations with domain experts can inspire novel applications. Strategies that help identify suitable off-the-shelf models for a particular task and address translational hurdles, such as clinical interpretability, prospective validation, and operational feasibility, can increase adoption and impact.
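
The Figure 2 caption names masked reconstruction as one of the common SSL approaches. The idea can be illustrated with a minimal sketch: cut a signal into patches, hide a subset, and score a prediction only on the hidden patches. This is not the method of any reviewed EEG-FM. All function names are hypothetical, the patch length and mask ratio are arbitrary, the signal is a synthetic sinusoid, and a trivial mean-fill baseline stands in for the transformer backbone a real model would use.

```python
import math
import random


def split_into_patches(signal, patch_len):
    """Cut a 1-D sequence into non-overlapping patches, dropping any remainder."""
    n = len(signal) // patch_len
    return [signal[i * patch_len:(i + 1) * patch_len] for i in range(n)]


def choose_masked(num_patches, mask_ratio, rng):
    """Pick the patch indices to hide, always leaving at least one visible."""
    if num_patches < 2:
        raise ValueError("need at least two patches to mask one and keep one")
    k = min(num_patches - 1, max(1, round(num_patches * mask_ratio)))
    return sorted(rng.sample(range(num_patches), k))


def mean_baseline(patches, masked_idx):
    """Placeholder 'model' (an assumption): fill each hidden patch with the
    mean of all visible samples. A real EEG-FM would use a learned encoder."""
    hidden = set(masked_idx)
    visible = [x for i, p in enumerate(patches) if i not in hidden for x in p]
    mean = sum(visible) / len(visible)
    return [[mean] * len(p) if i in hidden else list(p)
            for i, p in enumerate(patches)]


def masked_mse(patches, predictions, masked_idx):
    """Mean squared error computed on the hidden patches only."""
    total, count = 0.0, 0
    for i in masked_idx:
        for x, y in zip(patches[i], predictions[i]):
            total += (x - y) ** 2
            count += 1
    return total / count


# Synthetic single-channel signal: a 10 Hz sinusoid sampled at 256 Hz
# (illustrative values, not taken from the review).
rng = random.Random(0)
signal = [math.sin(2 * math.pi * 10 * t / 256) for t in range(1024)]

patches = split_into_patches(signal, patch_len=64)
masked = choose_masked(len(patches), mask_ratio=0.5, rng=rng)
predictions = mean_baseline(patches, masked)
loss = masked_mse(patches, predictions, masked)
print(f"{len(masked)} of {len(patches)} patches masked, baseline loss = {loss:.4f}")
```

Restricting the loss to hidden patches is what makes the task self-supervised: the visible samples act as context and the hidden samples as targets, so no annotations are needed.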