Table of Contents
Fetching ...

Evaluation Of P300 Speller Performance Using Large Language Models Along With Cross-Subject Training

Nithin Parthasarathy, James Soetedjo, Saarang Panchavati, Nitya Parthasarathy, Corey Arnold, Nader Pouratian, William Speier

TL;DR

Extensive simulations based on randomly sampled EEG data from subjects show substantial speed improvements in typing passages that include rare and out-of-vocabulary (OOV) words, with the extent of improvement varying depending on the language model utilized.

Abstract

Amyotrophic lateral sclerosis (ALS), a progressive neuromuscular degenerative disease, severely restricts patient communication capacity within a few years of onset, resulting in a significant deterioration of quality of life. The P300 speller brain computer interface (BCI) offers an alternative communication medium by leveraging a subject's EEG response to characters traditionally highlighted on a character grid on a graphical user interface (GUI). A recurring theme in P300-based research is enhancing performance to enable faster subject interaction. This study builds on that theme by addressing key limitations, particularly in the training of multi-subject classifiers, and by integrating advanced language models to optimize stimuli presentation and word prediction, thereby improving communication efficiency. Furthermore, various advanced large language models such as Generative Pre-Trained Transformer (GPT2), BERT, and BART, alongside Dijkstra's algorithm, are utilized to optimize stimuli and provide word completion choices based on the spelling history. In addition, a multi-layered smoothing approach is applied to allow for out-of-vocabulary (OOV) words. By conducting extensive simulations based on randomly sampled EEG data from subjects, we show substantial speed improvements in typing passages that include rare and out-of-vocabulary (OOV) words, with the extent of improvement varying depending on the language model utilized. The gains through such character-level interface optimizations are approximately 10%, and GPT2 for multi-word prediction provides gains of around 40%. In particular, some large language models achieve performance levels within 10% of the theoretical performance limits established in this study. In addition, both within and across subjects, training techniques are explored, and speed improvements are shown to hold in both cases.

Evaluation Of P300 Speller Performance Using Large Language Models Along With Cross-Subject Training

TL;DR

Extensive simulations based on randomly sampled EEG data from subjects show substantial speed improvements in typing passages that include rare and out-of-vocabulary (OOV) words, with the extent of improvement varying depending on the language model utilized.

Abstract

Amyotrophic lateral sclerosis (ALS), a progressive neuromuscular degenerative disease, severely restricts patient communication capacity within a few years of onset, resulting in a significant deterioration of quality of life. The P300 speller brain computer interface (BCI) offers an alternative communication medium by leveraging a subject's EEG response to characters traditionally highlighted on a character grid on a graphical user interface (GUI). A recurring theme in P300-based research is enhancing performance to enable faster subject interaction. This study builds on that theme by addressing key limitations, particularly in the training of multi-subject classifiers, and by integrating advanced language models to optimize stimuli presentation and word prediction, thereby improving communication efficiency. Furthermore, various advanced large language models such as Generative Pre-Trained Transformer (GPT2), BERT, and BART, alongside Dijkstra's algorithm, are utilized to optimize stimuli and provide word completion choices based on the spelling history. In addition, a multi-layered smoothing approach is applied to allow for out-of-vocabulary (OOV) words. By conducting extensive simulations based on randomly sampled EEG data from subjects, we show substantial speed improvements in typing passages that include rare and out-of-vocabulary (OOV) words, with the extent of improvement varying depending on the language model utilized. The gains through such character-level interface optimizations are approximately 10%, and GPT2 for multi-word prediction provides gains of around 40%. In particular, some large language models achieve performance levels within 10% of the theoretical performance limits established in this study. In addition, both within and across subjects, training techniques are explored, and speed improvements are shown to hold in both cases.

Paper Structure

This paper contains 14 sections, 9 equations, 11 figures, 1 table.

Figures (11)

  • Figure 1: A block diagram of the P300 speller based BCI used in this work. Subject EEG response to character highlights on a flashboard are shown.
  • Figure 2: Types of language models used. Starting with the simplest on the left to the most complex on the right
  • Figure 3: BCI display and volunteer EEG recording during stimulus presentation.
  • Figure 7: Layered GPT2 word choice selection to obtain most likely N word completions with fallback to Dijkstra's algorithm
  • Figure 8: Waveform analysis across all subjects in the dataset, subjects for four channels. These are the average ERPs for target (blue) and non-target (red) stimuli, which are averaged across all dataset subjects. Shaded regions represent the standard error of the mean, and vertical gray regions represent significant differences after using false discovery rate to account for multiple comparisons.
  • ...and 6 more figures