Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder

Jiaqi Wang, Zhenxi Song, Zhengyu Ma, Xipeng Qiu, Min Zhang, Zhiguo Zhang

TL;DR

This work tackles open-vocabulary EEG-to-Text decoding by learning transferable representations across EEG and text. It introduces CET-MAE, a pre-trained, multi-stream self-supervised model that jointly performs intra-modality masked reconstruction and cross-modality semantic alignment. It then builds E2T-PTR, a decoding framework that transfers CET-MAE representations and uses the BART language model to generate text from EEG sequences. Experiments on ZuCo show state-of-the-art BLEU and ROUGE scores, pointing to the potential for more powerful brain-computer interfaces.
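
One way to picture the joint objective described above is as a weighted sum of a masked-reconstruction term and a contrastive alignment term. The sketch below is a minimal NumPy illustration under assumptions not stated in this summary: mean squared error over masked EEG positions, a symmetric InfoNCE loss over paired EEG and text sentence embeddings, and a hypothetical weight `lambda_contrast`. All function names are illustrative, the text-reconstruction branch is omitted for brevity, and the paper's actual losses, weights, and encoders may differ.

```python
import numpy as np

def masked_reconstruction_loss(target, prediction, mask):
    """Mean squared error computed over masked positions only.

    target, prediction: arrays of shape (batch, seq, dim)
    mask: boolean array of shape (batch, seq), True where the input was masked
    """
    sq_err = ((prediction - target) ** 2).mean(axis=-1)  # (batch, seq)
    n_masked = mask.sum()
    if n_masked == 0:
        return 0.0
    return float((sq_err * mask).sum() / n_masked)

def _log_softmax(x, axis):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def contrastive_alignment_loss(eeg_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE: row i of eeg_emb is the positive pair of row i of text_emb."""
    eeg = eeg_emb / np.linalg.norm(eeg_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = eeg @ txt.T / temperature  # (batch, batch) cosine similarities
    idx = np.arange(logits.shape[0])
    loss_eeg_to_text = -_log_softmax(logits, axis=1)[idx, idx].mean()
    loss_text_to_eeg = -_log_softmax(logits, axis=0)[idx, idx].mean()
    return float((loss_eeg_to_text + loss_text_to_eeg) / 2)

def combined_loss(eeg_target, eeg_pred, eeg_mask, eeg_emb, text_emb,
                  lambda_contrast=1.0):
    """Hypothetical weighting of the two terms; lambda_contrast is an assumption."""
    recon = masked_reconstruction_loss(eeg_target, eeg_pred, eeg_mask)
    align = contrastive_alignment_loss(eeg_emb, text_emb)
    return recon + lambda_contrast * align
```

In this sketch, correctly paired EEG and text embeddings yield a lower alignment loss than mismatched pairs, which is the property a contrastive term is meant to enforce.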

Abstract

Reconstructing natural language from non-invasive electroencephalography (EEG) holds great promise as a language decoding technology for brain-computer interfaces (BCIs). However, EEG-based language decoding is still in its nascent stages and faces several technical issues, such as: 1) the absence of a hybrid strategy that can effectively integrate cross-modality (between EEG and text) self-learning with intra-modality self-reconstruction of EEG features or textual sequences; 2) the under-utilization of large language models (LLMs) to enhance EEG-based language decoding. To address the above issues, we propose the Contrastive EEG-Text Masked Autoencoder (CET-MAE), a novel model that orchestrates compound self-supervised learning across and within EEG and text through a dedicated multi-stream encoder. Furthermore, we develop a framework called E2T-PTR (EEG-to-Text decoding using Pretrained Transferable Representations), which leverages pre-trained modules alongside the EEG stream from CET-MAE and further enables an LLM (specifically BART) to decode text from EEG sequences. Comprehensive experiments conducted on the popular text-evoked EEG database, ZuCo, demonstrate the superiority of E2T-PTR, which outperforms the state-of-the-art in ROUGE-1 F1 and BLEU-4 scores by 8.34% and 32.21%, respectively. These results indicate significant advancements in the field and underscore the proposed framework's potential to enable more powerful and widespread BCI applications.
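
For readers unfamiliar with the ROUGE-1 F1 metric reported above, the following is a minimal sketch of how a unigram-overlap F1 score can be computed. It assumes lowercased whitespace tokenization with no stemming; `rouge1_f1` is an illustrative name, and the evaluation toolkit and tokenization actually used in the paper are not specified here, so scores from this sketch are not directly comparable to the reported numbers.

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference sentence and a decoded candidate.

    Assumes lowercased whitespace tokenization; standard ROUGE toolkits may
    tokenize and stem differently.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

For example, with the reference "the cat sat on the mat" and the candidate "the cat is on the mat", five of six unigrams overlap, so precision, recall, and F1 are all 5/6.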

Paper Structure (24 sections, 2 equations, 4 figures, 16 tables)

Figures (4)

  • Figure 1: Text-evoked EEG recording in the ZuCo datasets. Participants' EEG and eye-tracking data are recorded simultaneously during natural reading to capture text-evoked brain activity.
  • Figure 2: Illustration of the proposed EEG-text pre-training model (CET-MAE) and EEG-to-Text decoding framework (E2T-PTR). (a) CET-MAE Model: CET-MAE features modality-specific autoencoders with a masking strategy for text and EEG features, complemented by a multi-stream transformer encoder that orchestrates self-reconstruction and cross-modality semantic alignment, enhancing representation learning for EEG semantic decoding. (b) E2T-PTR Framework: E2T-PTR transfers both word- and sentence-level EEG representations extracted from CET-MAE's pre-trained modules, further facilitating text generation through BART.
  • Figure 3: Radar chart of the results on each metric for 18 subjects, from Subject YAG to Subject YSD.
  • Figure 4: Radar chart of the results on each metric for 12 subjects, from Subject ZKW to Subject ZJS.