Cueless EEG imagined speech for subject identification: dataset and benchmarks

Ali Derakhshesh, Zahra Dehghanian, Reza Ebrahimpour, Hamid R. Rabiee

TL;DR

This study introduces a cueless EEG-based imagined speech paradigm, where subjects imagine the pronunciation of semantically meaningful words without any external cues, and demonstrates outstanding classification accuracy, reaching 97.93%.

Abstract

Electroencephalogram (EEG) signals have emerged as a promising modality for biometric identification. While previous studies have explored the use of imagined speech with semantically meaningful words for subject identification, most have relied on additional visual or auditory cues. In this study, we introduce a cueless EEG-based imagined speech paradigm, where subjects imagine the pronunciation of semantically meaningful words without any external cues. This approach addresses the limitations of prior methods by having subjects naturally select and imagine words from a predefined list. The dataset comprises over 4,350 trials from 11 subjects across five sessions. We assess a variety of classification methods, including traditional machine learning techniques such as Support Vector Machines (SVM) and XGBoost, as well as time-series foundation models and deep learning architectures specifically designed for EEG classification, such as EEG Conformer and Shallow ConvNet. We employ a session-based hold-out validation strategy to ensure reliable evaluation and prevent data leakage. Our results demonstrate high classification accuracy, reaching 97.93%. These findings highlight the potential of cueless EEG paradigms for secure and reliable subject identification in real-world applications, such as brain-computer interfaces (BCIs).
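To illustrate what a session-based hold-out can look like in practice, the following is a minimal sketch, not the authors' implementation. It assumes each trial carries a session label and reserves whole sessions for testing, so that no session contributes trials to both sides of the split. The helper name `session_holdout_split`, the toy feature matrix, the four trials per session, and the choice of session 5 as the held-out session are all hypothetical; only the 11 subjects and five sessions come from the abstract.

```python
import numpy as np


def session_holdout_split(sessions, test_sessions):
    """Return boolean (train, test) masks that keep whole sessions together.

    Hypothetical helper for illustration; not taken from the paper.
    """
    sessions = np.asarray(sessions)
    test_mask = np.isin(sessions, list(test_sessions))
    return ~test_mask, test_mask


# Toy metadata (assumed): 11 subjects x 5 sessions x 4 trials per session.
rng = np.random.default_rng(0)
subjects = np.repeat(np.arange(11), 5 * 4)             # subject label per trial
sessions = np.tile(np.repeat(np.arange(1, 6), 4), 11)  # session label per trial
X = rng.normal(size=(subjects.size, 8))                # placeholder feature vectors

# Hold out session 5 entirely (an arbitrary choice for this sketch).
train_mask, test_mask = session_holdout_split(sessions, test_sessions={5})
X_train, y_train = X[train_mask], subjects[train_mask]
X_test, y_test = X[test_mask], subjects[test_mask]
```

Splitting by session rather than by random trial matters because trials recorded in the same session share electrode placement and recording conditions; a random split would let a classifier exploit those session-specific artifacts instead of subject-specific signal.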

Paper Structure (27 sections, 5 equations, 6 figures, 7 tables)


Figures (6)

  • Figure 1: Trial paradigm. Times above the arrow show the intervals between phases. Red-highlighted keys on the keyboard show the keys the subject could press.
  • Figure 2: EEG channels used in the experiment, arranged according to the 10-20 system.
  • Figure 3: t-SNE visualizations of the extracted features in a 2-dimensional space for subject identification. From left to right, the subplots illustrate statistical features, wavelet features, and fine-tuned MOMENT-small embeddings, respectively.
  • Figure 4: Performance of Shallow ConvNet and EEG Conformer for subject identification using different numbers of sessions.
  • Figure 5: Visualization of zero-shot embeddings for three different MOMENT model sizes: large, base, and small. The embeddings for each model size are shown side-by-side for comparison, highlighting the differences in feature space representation.
  • ...and 1 more figures