Brain-to-Text Decoding: A Non-invasive Approach via Typing

Jarod Lévy, Mingfang Zhang, Svetlana Pinet, Jérémy Rapin, Hubert Banville, Stéphane d'Ascoli, Jean-Rémi King

TL;DR

This work addresses safe, non-invasive brain–computer interfaces for communication by decoding language production from MEG/EEG while users type memorized sentences. It introduces Brain2Qwerty, a three-stage neural pipeline (Convolutional Module, Transformer, and a pretrained 9-gram language model) that maps 0.5 s windows of brain activity to 29 keyboard keys. Across 35 participants, MEG achieves a mean CER of 32% (best 19%) and EEG 67%, with some sentences decoded perfectly, indicating substantial progress toward non-invasive, rapid brain-to-text communication; the approach markedly outperforms baselines and its ablations show the additive value of each component. While not yet real-time or clinically ready, this work narrows the gap to invasive BCIs and sets the stage for future real-time, imagination-based, and wearable-sensor developments that could benefit non-communicating patients.

Abstract

Modern neuroprostheses can now restore communication in patients who have lost the ability to speak or move. However, these invasive devices entail risks inherent to neurosurgery. Here, we introduce a non-invasive method to decode the production of sentences from brain activity and demonstrate its efficacy in a cohort of 35 healthy volunteers. For this, we present Brain2Qwerty, a new deep learning architecture trained to decode sentences from either electro- (EEG) or magneto-encephalography (MEG), while participants typed briefly memorized sentences on a QWERTY keyboard. With MEG, Brain2Qwerty reaches, on average, a character-error-rate (CER) of 32% and substantially outperforms EEG (CER: 67%). For the best participants, the model achieves a CER of 19%, and can perfectly decode a variety of sentences outside of the training set. While error analyses suggest that decoding depends on motor processes, the analysis of typographical errors suggests that it also involves higher-level cognitive factors. Overall, these results narrow the gap between invasive and non-invasive methods and thus open the path for developing safe brain-computer interfaces for non-communicating patients.
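The character-error-rate reported above is conventionally the edit distance between the decoded and the reference sentence, divided by the length of the reference. The sketch below illustrates that standard definition. It is not taken from the paper's code, and the function names `levenshtein` and `character_error_rate` are hypothetical. Details such as case handling or whitespace normalization are not specified here and may differ from the authors' evaluation.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn `hyp` into `ref`."""
    # prev[j] holds the distance between the first i-1 characters
    # of ref and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # character of ref missing from hyp
                curr[j - 1] + 1,     # extra character in hyp
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by the reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference sentence must be non-empty")
    return levenshtein(ref, hyp) / len(ref)
```

Under this definition, a five-character reference decoded with a single wrong character gives a CER of 0.2, and a CER can exceed 1 when the decoded sentence is much longer than the reference.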

Paper Structure

This paper contains 34 sections, 1 equation, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Approach. Recordings from 35 participants were obtained using electro-encephalography (EEG) and magneto-encephalography (MEG). Sentences were displayed word-by-word on a screen. Following the final word, a visual cue prompted participants to begin typing the sentence, without visual feedback. Our Brain2Qwerty model includes three core stages to decode text from brain activity: (1) a convolutional module, which takes 500 ms windows of M/EEG signals as input, (2) a transformer module trained at the sentence level, and (3) a pretrained language model to correct the outputs of the transformer. Performance is assessed using a Character Error Rate (CER) at the sentence level. An analysis of how the brain performs typing is described in a companion paper (lucy2025).
  • Figure 2: Decoding Performance across models. A. Difference in EEG evoked responses between left vs right hand key presses. Each black line is the differential voltage of a sensor relative to key press. B. Same as A but for MEG. C. Linear classifiers are trained, at each time sample, to predict the left vs right hand relative to each key press. The gray line represents chance level and the error bar is the standard error of the mean across participants. Significant decoding scores (p < 0.05) are marked with a star. D. Same as C but for character classification. E-H. Comparison of baselines (linear and EEGNet), and ablation of our three-step Brain2Qwerty model (Conv+Trans+Language Model), for both hand-error-rate (HER) and character-error-rate (CER). Each point represents the average score of a single participant. Statistical significance is denoted with p < 0.05 (*), p < 0.01 (**), and p < 0.001 (***).
  • Figure 3: Sentence-level performance for Best, Median and Worst MEG subjects. A. Character-error-rate for three representative subjects. Each dot represents a unique sentence, with error bars indicating the standard error of the mean across repetitions. White dots correspond to the sentences displayed below. B. Decoding predictions for two sentences. Several splitting seeds were used to obtain the predictions across sentences.
  • Figure 4: Analysis of character- and word-level performance. The results presented are specific to MEG data processed using the Conv+Trans model. A. Character-error-rate (CER) is broken down by part-of-speech category to show how performance varies across adjectives (ADJ), nouns, verbs, determiners (DET), and prepositions (ADP). B. CER as a function of word frequency. Out-of-vocabulary (OOV) decoding is used to test whether Brain2Qwerty can decode words absent from the training set. C. CER as a function of character frequency. D. CER as a function of recording time included in the training set.
  • Figure 5: Impact of keyboard layout and typing errors. The results presented are specific to MEG data processed using the Conv+Trans model. A. Keyboard Distance Effect. Confusion rate is analyzed against normalized keyboard distance. B. Clustering Analysis. K-means clustering of the model embeddings with 2 (top) and 10 (bottom) clusters respectively. C. Keypress Intervals Analysis. Comparison of keypress interval for correct keystrokes and typing errors, focusing on both preceding and subsequent characters. The sum of the two intervals is displayed. D. Typing Mistakes Performance Differences. Performance comparison for correct characters versus typing errors using the Conv+Trans model (left) and the Conv model (right).
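
Panel A of Figure 5 relates confusion rate to a normalized keyboard distance. The caption does not say how that distance is computed, so the sketch below shows only one plausible construction: Euclidean distance between key centers on a staggered QWERTY grid, divided by the largest pairwise distance. The row offsets, the restriction to the 26 letter keys, and the names `KEY_POS`, `key_distance` and `normalized_key_distance` are all assumptions made for illustration, not the authors' method.

```python
import math
from itertools import combinations

# Letter rows of a QWERTY keyboard. The horizontal stagger of each row
# (in key widths) is an assumed, typical value, not taken from the paper.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
ROW_OFFSETS = [0.0, 0.25, 0.75]

# Map each letter to the (x, y) position of its key center, in key widths.
KEY_POS = {
    key: (col + offset, float(row))
    for row, (keys, offset) in enumerate(zip(ROWS, ROW_OFFSETS))
    for col, key in enumerate(keys)
}


def key_distance(a: str, b: str) -> float:
    """Euclidean distance between two letter keys, in key widths."""
    (xa, ya), (xb, yb) = KEY_POS[a], KEY_POS[b]
    return math.hypot(xa - xb, ya - yb)


# Largest distance between any two letter keys, used as the normalizer.
MAX_DISTANCE = max(key_distance(a, b) for a, b in combinations(KEY_POS, 2))


def normalized_key_distance(a: str, b: str) -> float:
    """Key distance scaled to [0, 1]; 0 for the same key, 1 for the farthest pair."""
    return key_distance(a.lower(), b.lower()) / MAX_DISTANCE
```

With such a measure, confusions between neighboring keys (for example `q` and `w`) fall at the low end of the distance axis, and confusions between keys on opposite sides of the keyboard at the high end.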