Decoding individual words from non-invasive brain recordings across 723 participants
Stéphane d'Ascoli, Corentin Bel, Jérémy Rapin, Hubert Banville, Yohann Benchetrit, Christophe Pallier, Jean-Rémi King
TL;DR
The study tackles decoding individual words from non-invasive brain recordings (EEG/MEG) at scale, using a large, multilingual dataset. It introduces a deep learning pipeline that maps brain activity to semantic word representations via pretrained language-model embeddings, trained with CLIP-style objectives and a deduplicated SigLIP variant. On 723 participants across diverse devices and languages, the approach achieves robust word decoding, with MEG and reading conditions yielding the strongest performance and consistent gains from increasing the amount of training data and from averaging predictions at test time. Analyses reveal that decoded words reflect both semantic content and syntactic and surface properties such as part-of-speech and word length, informing theories of neural language representation and highlighting practical directions for non-invasive brain-to-text interfaces. The work outlines concrete paths and remaining challenges for translating non-invasive word decoding into real-time, natural-language BCIs, while contributing to our understanding of how semantic and syntactic information is represented in the brain.
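To make the "CLIP-style objective" concrete, here is a minimal sketch of a generic symmetric contrastive loss between brain-derived embeddings and word embeddings. It is an illustration under stated assumptions, not the authors' implementation: the function names, the temperature value, and the use of NumPy are all hypothetical, and the deduplicated SigLIP variant mentioned above is not shown.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def clip_style_loss(brain_emb, word_emb, temperature=0.1):
    """Symmetric contrastive loss; row i of each input is a matching pair.

    The temperature is an assumed value chosen for illustration only.
    """
    b = l2_normalize(brain_emb)
    w = l2_normalize(word_emb)
    logits = b @ w.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(len(logits))
    # Brain -> word: each brain segment should pick out its own word.
    loss_b2w = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Word -> brain: each word should pick out its own brain segment.
    loss_w2b = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_b2w + loss_w2b)

# Synthetic check: well-aligned pairs should give a lower loss than mismatched ones.
rng = np.random.default_rng(0)
word_emb = rng.normal(size=(8, 16))
aligned = word_emb + 0.01 * rng.normal(size=(8, 16))
loss_aligned = clip_style_loss(aligned, word_emb)
loss_mismatched = clip_style_loss(np.roll(aligned, 1, axis=0), word_emb)
```

In this toy setup the word embeddings are random vectors; in the setting described above they would come from a pretrained language model, and the brain embeddings from a network applied to the EEG or MEG signal.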
Abstract
Deep learning has recently enabled the decoding of language from the neural activity of a few participants with electrodes implanted inside their brain. However, reliably decoding words from non-invasive recordings remains an open challenge. To tackle this issue, we introduce a novel deep learning pipeline to decode individual words from non-invasive electro- (EEG) and magneto-encephalography (MEG) signals. We train and evaluate our approach on an unprecedentedly large number of participants (723) exposed to five million words either written or spoken in English, French or Dutch. Our model outperforms existing methods consistently across participants, devices, languages, and tasks, and can decode words absent from the training set. Our analyses highlight the importance of the recording device and experimental protocol: MEG and reading are easier to decode than EEG and listening, respectively, and it is preferable to collect a large amount of data per participant rather than to repeat stimuli across a large number of participants. Furthermore, decoding performance consistently increases with the amount of (i) data used for training and (ii) data used for averaging during testing. Finally, single-word predictions show that our model effectively relies on word semantics but also captures syntactic and surface properties such as part-of-speech, word length and even individual letters, especially in the reading condition. Overall, our findings delineate the path and remaining challenges towards building non-invasive brain decoders for natural language.
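The effect of averaging during testing can be illustrated with a small synthetic sketch. It assumes, for illustration only, that averaging is applied to the predicted embeddings of repeated presentations of the same word before retrieving the nearest vocabulary entry by cosine similarity. The function names, the noise level, and the toy vocabulary are hypothetical and do not come from the paper.

```python
import numpy as np

def retrieve(pred, vocab_emb):
    """Return the index of the vocabulary embedding closest to pred (cosine)."""
    p = pred / np.linalg.norm(pred)
    v = vocab_emb / np.linalg.norm(vocab_emb, axis=1, keepdims=True)
    return int(np.argmax(v @ p))

def top1_accuracy(preds, labels, vocab_emb, n_avg, rng):
    """Top-1 retrieval accuracy when averaging n_avg predictions per word."""
    hits = 0
    words = np.unique(labels)
    for word in words:
        idx = np.flatnonzero(labels == word)
        # Pick up to n_avg repetitions of this word and average their predictions.
        chosen = rng.choice(idx, size=min(n_avg, len(idx)), replace=False)
        hits += retrieve(preds[chosen].mean(axis=0), vocab_emb) == word
    return hits / len(words)

# Toy data: each prediction is the true word embedding plus strong noise,
# standing in for a noisy single-trial decoder output.
rng = np.random.default_rng(0)
n_words, dim, n_rep = 50, 32, 16
vocab_emb = rng.normal(size=(n_words, dim))
labels = np.repeat(np.arange(n_words), n_rep)
preds = vocab_emb[labels] + rng.normal(scale=4.0, size=(n_words * n_rep, dim))

acc_single = top1_accuracy(preds, labels, vocab_emb, n_avg=1, rng=rng)
acc_averaged = top1_accuracy(preds, labels, vocab_emb, n_avg=n_rep, rng=rng)
```

Because independent noise shrinks when predictions are averaged while the shared signal does not, `acc_averaged` comes out higher than `acc_single` in this toy setting, which mirrors the qualitative trend reported in the abstract.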
