Improving SSVEP BCI Spellers With Data Augmentation and Language Models
Joseph Zhang, Ruiming Zhang, Kipngeno Koech, David Hill, Kateryna Shapovalenko
TL;DR
This work tackles the variability and generalization challenges of SSVEP-based BCI spellers by coupling data augmentation with a language-model-based prior. The authors establish a strong EEGNet baseline on the Benchmark dataset and systematically evaluate a suite of augmentation strategies, finding that most hamper performance while time masking can reduce validation loss. A novel hybrid model, EEGNet-CharRNN, integrates a character-level RNN with EEGNet using a tunable weighting parameter to improve letter-level decoding, yielding gains of up to 2.9 percentage points on unseen subjects and about 96% accuracy on the all-subject evaluation at the optimal weighting. These results demonstrate the potential for language-model priors to enhance BCI spelling in real time, especially for subjects whose EEG signals are challenging to classify. The work outlines concrete directions for real-time full-sentence spellers and suggests that larger language models could further boost performance in practical deployments.
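To make the role of the weighting parameter concrete, the following is a minimal sketch of one way an EEG classifier's output could be combined with a language-model prior. The summary above does not state the exact fusion rule used by EEGNet-CharRNN, so the log-linear interpolation here, the function name `fuse_predictions`, and the example probabilities are all assumptions chosen for illustration.

```python
import numpy as np


def fuse_predictions(eeg_probs, lm_probs, weight):
    """Blend EEG-classifier and language-model probabilities for one character.

    Assumed rule (not confirmed by the source): log-linear interpolation,
    score = (1 - weight) * log p_eeg + weight * log p_lm, renormalized.
    weight = 0 uses only the EEG model; weight = 1 uses only the language model.
    """
    eeg_probs = np.asarray(eeg_probs, dtype=float)
    lm_probs = np.asarray(lm_probs, dtype=float)
    eps = 1e-12  # avoids log(0) for characters with zero probability
    log_score = (1.0 - weight) * np.log(eeg_probs + eps) \
        + weight * np.log(lm_probs + eps)
    log_score -= log_score.max()  # numerical stability before exponentiating
    fused = np.exp(log_score)
    return fused / fused.sum()


# Hypothetical three-character example: the EEG model is nearly undecided,
# while the language model strongly favors the second character.
eeg = [0.40, 0.35, 0.25]
lm = [0.10, 0.80, 0.10]
eeg_only = fuse_predictions(eeg, lm, weight=0.0)
blended = fuse_predictions(eeg, lm, weight=0.5)
print(int(np.argmax(eeg_only)), int(np.argmax(blended)))
```

In this sketch, an ambiguous EEG prediction is overturned by linguistic context once the weight is large enough, which matches the intuition that a language prior helps most for subjects whose EEG signals are hard to classify.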
Abstract
Steady-State Visual Evoked Potential (SSVEP) spellers are a promising communication tool for individuals with disabilities. This brain-computer interface (BCI) uses scalp potential data from electroencephalography (EEG) electrodes on a subject's head to decode the specific letters or arbitrary targets the subject is looking at on a screen. However, deep neural networks for SSVEP spellers often suffer from low accuracy and poor generalizability to unseen subjects, largely due to the high variability in EEG data. In this study, we propose a hybrid approach combining data augmentation and language modeling to enhance the performance of SSVEP spellers. Using the Benchmark dataset from Tsinghua University, we explore various data augmentation techniques, including frequency masking, time masking, and noise injection, to improve the robustness of deep learning models. Additionally, we integrate a language model (CharRNN) with EEGNet to incorporate linguistic context, significantly enhancing word-level decoding accuracy. Our results demonstrate accuracy improvements of up to 2.9 percentage points over the baseline, with time masking and language modeling showing the most promise. This work paves the way for more accurate and generalizable SSVEP speller systems, offering improved communication solutions for individuals with disabilities.
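The three augmentation techniques named in the abstract can be sketched as follows. This is an illustrative NumPy version, not the authors' implementation: the function names, the trial shape of 64 channels by 250 samples, the 250 Hz sampling rate, and all mask widths and noise levels are assumptions made for the example.

```python
import numpy as np


def time_mask(trial, max_width, rng):
    """Zero a random contiguous window of samples across all channels."""
    n_samples = trial.shape[-1]
    width = int(rng.integers(1, max_width + 1))
    start = int(rng.integers(0, n_samples - width + 1))
    out = trial.copy()
    out[..., start:start + width] = 0.0
    return out


def frequency_mask(trial, fs, band_hz, rng):
    """Remove a random frequency band by zeroing FFT bins, then invert."""
    n_samples = trial.shape[-1]
    spectrum = np.fft.rfft(trial, axis=-1)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    low = rng.uniform(0.0, freqs[-1] - band_hz)
    band = (freqs >= low) & (freqs < low + band_hz)
    spectrum[..., band] = 0.0
    return np.fft.irfft(spectrum, n=n_samples, axis=-1)


def add_noise(trial, std, rng):
    """Add zero-mean Gaussian noise with the given standard deviation."""
    return trial + rng.normal(0.0, std, size=trial.shape)


rng = np.random.default_rng(0)
trial = rng.normal(size=(64, 250))  # hypothetical trial: channels x samples

masked_time = time_mask(trial, max_width=50, rng=rng)
masked_freq = frequency_mask(trial, fs=250.0, band_hz=5.0, rng=rng)
noisy = add_noise(trial, std=0.1, rng=rng)
print(masked_time.shape, masked_freq.shape, noisy.shape)
```

Each function returns a new array and leaves the input untouched, so the transforms can be applied independently or chained during training. Note that a frequency mask may remove the very stimulus frequency an SSVEP classifier relies on, which is one plausible reason such augmentations can hurt rather than help.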
