Laya: A LeJEPA Approach to EEG via Latent Prediction over Reconstruction

Saarang Panchavati, Uddhav Panchavati, Corey Arnold, William Speier

Abstract

Electroencephalography (EEG) is a widely used tool for studying brain function, with applications in clinical neuroscience, diagnosis, and brain-computer interfaces (BCIs). Recent EEG foundation models trained on large unlabeled corpora aim to learn transferable representations, but their effectiveness remains unclear; reported improvements over smaller task-specific models are often modest, sensitive to downstream adaptation and fine-tuning strategies, and limited under linear probing. We hypothesize that one contributing factor is the reliance on signal reconstruction as the primary self-supervised learning (SSL) objective, which biases representations toward high-variance artifacts rather than task-relevant neural structure. To address this limitation, we explore an SSL paradigm based on Joint Embedding Predictive Architectures (JEPA), which learn by predicting latent representations instead of reconstructing raw signals. While earlier JEPA-style methods often rely on additional heuristics to ensure training stability, recent advances such as LeJEPA provide a more principled and stable formulation. We introduce Laya, the first EEG foundation model based on LeJEPA. Across a range of EEG benchmarks, Laya demonstrates improved performance under linear probing compared to reconstruction-based baselines, suggesting that latent predictive objectives offer a promising direction for learning transferable, high-level EEG representations.

Paper Structure

This paper contains 46 sections, 4 equations, 12 figures, 7 tables, and 1 algorithm.

Figures (12)

  • Figure 1: Laya architecture overview. (Top) Raw EEG ($C \times T$) is processed through a convolutional patch embedder and channel mixer to produce latent brain states $\mathbf{S} \in \mathbb{R}^{B \times N \times D}$, which are then encoded to produce representations $\mathbf{Z}$. (Bottom) During pretraining, the encoder processes both the full sequence and a masked version (contiguous temporal mask $\mathbf{m}$). Solid arrows indicate the prediction path: masked context representations $\mathbf{Z}_{\mathrm{ctx}}$ are projected to $\mathbf{P}_{\mathrm{ctx}}$ and passed through the predictor. Dotted arrows indicate the target path: full representations $\mathbf{Z}$ are projected with stop-gradient to produce targets $\mathbf{T}$. The predictor output is trained to match the targets at masked positions via $\mathcal{L}_{\mathrm{MSE}}$. Dashed arrows indicate the regularization path: $\mathbf{Z}$ is mean-pooled to $\mathbf{z}_{\mathrm{cls}}$, projected to $\mathbf{p}_{\mathrm{cls}}$, and regularized via $\mathcal{L}_{\mathrm{SIGReg}}$ to prevent representation collapse. The channel mixer detail (middle left) shows how patches are added to electrode positions and combined with learned queries via cross-attention.
  • Figure 2: Laya is resilient to noise across clinical tasks. Performance shown under combined noise (Gaussian, 1/f, EMG, channel dropout) at varying SNR levels.
  • Figure 3: PCA visualization of patch embeddings on two seizure recordings. We map three principal components to RGB. Top: Laya. Bottom: LaBraM. Red bar indicates seizure. Laya shows a clear representational shift at seizure onset, while LaBraM does not.
  • Figure 4: Topoplot of the channel weights for eight queries on the HBN dataset during the movie-watching task (DiaryOfAWimpyKid).
  • Figure 5: Topoplot of the channel weights for eight queries on the TUH dataset.
  • ...and 7 more figures
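
The pretraining loss described in the Figure 1 caption can be illustrated with a small NumPy sketch. This is not the authors' implementation: the encoder, projectors, predictor, and the SIGReg regularizer are all omitted, and the function names (`contiguous_temporal_mask`, `masked_latent_mse`, `mean_pool`) and toy shapes are assumptions made for illustration. It shows only three pieces: a contiguous temporal mask over the N patches, an MSE computed at masked positions only, and the mean pooling that produces one vector per sample. NumPy has no autodiff, so the stop-gradient on the targets is noted in a comment rather than implemented.

```python
import numpy as np

def contiguous_temporal_mask(n_patches, mask_len, start):
    """Boolean mask over N patches; True marks the masked (to-be-predicted) span."""
    if not (0 <= start and start + mask_len <= n_patches):
        raise ValueError("mask span must lie inside the sequence")
    m = np.zeros(n_patches, dtype=bool)
    m[start:start + mask_len] = True
    return m

def masked_latent_mse(pred, target, mask):
    """Mean squared error between predictions and targets at masked patches only.

    pred, target: arrays of shape (B, N, D); mask: boolean array of shape (N,).
    In an autodiff framework, target would be wrapped in a stop-gradient.
    """
    diff = pred[:, mask, :] - target[:, mask, :]
    return float(np.mean(diff ** 2))

def mean_pool(z):
    """Mean-pool patch representations (B, N, D) into one vector per sample (B, D)."""
    return z.mean(axis=1)

# Toy shapes (hypothetical): batch B=2, N=8 patches, D=4 latent dimensions.
rng = np.random.default_rng(0)
B, N, D = 2, 8, 4
targets = rng.standard_normal((B, N, D))  # stands in for the targets T
mask = contiguous_temporal_mask(N, mask_len=3, start=2)

preds = targets.copy()
preds[:, ~mask, :] += 1.0  # errors at visible patches do not affect the loss
loss_perfect = masked_latent_mse(preds, targets, mask)

preds[:, mask, :] += 0.5   # a constant offset at masked patches does
loss_offset = masked_latent_mse(preds, targets, mask)

# Pooling is applied to the same toy array here, in place of the representations Z.
z_cls = mean_pool(targets)
```

In this toy setup the loss is zero as long as the predictions agree with the targets inside the masked span, regardless of what happens at visible patches, which is the sense in which the caption's prediction path "match[es] masked targets". A full objective would add the SIGReg term on the pooled, projected vector, which this sketch deliberately leaves out.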