A Conversational Brain-Artificial Intelligence Interface

Anja Meunier, Michal Robert Žák, Lucas Munz, Sofiya Garkot, Manuel Eder, Jiachen Xu, Moritz Grosse-Wentrup

TL;DR

The paper proposes Brain-Artificial Intelligence Interfaces (BAIs) to extend BCIs by delegating parts of cognitive processing to AI, enabling complex tasks for users with cognitive impairments. It introduces EEGChat, a non-invasive Conversational BAI that uses contextual input, GPT-based cognitive probing, and code-VEP EEG decoding to select keywords that guide a fine-tuned language model to generate fluent full-sentence responses. In a simulated phone-conversation study with five healthy participants, EEGChat enabled goal-directed communication, with HQ-tuned sentence generation delivering the most reliable outputs; results indicate potential to broaden BCI usability to aphasia and other language impairments. The work highlights the need for task-specific fine-tuning, robust decoding, and careful ethical considerations as BAIs bridge human intent and AI-generated content in neuroprosthetic settings.

Abstract

We introduce Brain-Artificial Intelligence Interfaces (BAIs) as a new class of Brain-Computer Interfaces (BCIs). Unlike conventional BCIs, which rely on intact cognitive capabilities, BAIs leverage the power of artificial intelligence to replace parts of the neuro-cognitive processing pipeline. BAIs allow users to accomplish complex tasks by providing high-level intentions, while a pre-trained AI agent determines low-level details. This approach enlarges the target audience of BCIs to individuals with cognitive impairments, a population often excluded from the benefits of conventional BCIs. We present the general concept of BAIs and illustrate the potential of this new approach with a Conversational BAI based on EEG. In particular, we show in an experiment with simulated phone conversations that the Conversational BAI enables complex communication without the need to generate language. Our work thus demonstrates, for the first time, the ability of a speech neuroprosthesis to enable fluent communication in realistic scenarios with non-invasive technologies.

Paper Structure (32 sections, 6 figures, 1 table)

Figures (6)

  • Figure 1: Structure of a Brain-AI Interface. The BAI first receives contextual information from the environment and then probes the user about their intentions. These are subsequently decoded from neurophysiological data and then translated into action by the BAI.
  • Figure 2: Structure (A) and screenshot (B) of EEGChat, the first Conversational BAI. One question-answer exchange occurs as follows: The conversation partner asks a question, which is automatically transcribed and supplied to both the user and the BAI (displayed in the top left box on the screen). The Conversational BAI generates short possible answers (keywords) and displays them to the user. Additionally, there are four special options that let the user correct mistakes ("Correction"), show more keywords ("More"), choose no keyword ("None"), or end the conversation ("Finished"). After reading the options, the user focuses on their chosen answer during a code-VEP stimulation phase. The choice is decoded from EEG and supplied to the BAI. The BAI then generates an appropriate answer to the question using the conversation history and the chosen keyword. This full-sentence answer is displayed on the bottom left of the screen and played as text-to-speech audio. The current scenario goal was displayed at the top of the screen during the experiment.
  • Figure 3: A detailed view of the experiment, consisting of a classifier training and UI familiarization phase, an evaluation scenario phase, and a classifier accuracy evaluation phase. The structure of each scenario is shown on the right.
  • Figure 4: (A) Histogram of the time until a keyword was selected. Due to their low accuracy, S5 often reached the maximum time (10.85 s). (B) Mean and standard deviation of keyword selection time per subject. (C) Percentage of times a keyword at the given position was selected (excluding special options). Keywords 7 to 12 were only shown after selecting "More". (D) Stimulus selection accuracy during the accuracy evaluation stage, per subject. Accuracy was computed with 11 questions for S1 and with 20 questions for S2 to S5. The red line indicates chance level. (E) Results from the post-experiment participant survey.
  • Figure 5: From left to right: Distribution of the performance metrics evaluated by human experiment participants, distribution of the mistakes causing a model to have a lower-than-perfect score, and results of the performance analysis of the different models. The rating was computed from the answer performance, on a scale from perfect (5) to wrong (1). For the adjusted rating, the original rating is multiplied by a factor that depends on the mistake made (slightly different: 1, too brief/short: 0.9, missed details: $\frac{2}{3}$, added details: 0.5, wrong information: 0.25, and totally different: 0.1).
  • ...and 1 more figure
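
The adjusted rating described in the Figure 5 caption can be illustrated with a minimal Python sketch. The factor values are taken from the caption; the names `MISTAKE_FACTORS` and `adjusted_rating` are hypothetical, and treating an answer with no recorded mistake as having a factor of 1 is an assumption, not something the caption states.

```python
from fractions import Fraction
from typing import Optional

# Multiplicative factors per mistake type, as listed in the Figure 5 caption.
MISTAKE_FACTORS = {
    "slightly different": Fraction(1),
    "too brief/short": Fraction(9, 10),
    "missed details": Fraction(2, 3),
    "added details": Fraction(1, 2),
    "wrong information": Fraction(1, 4),
    "totally different": Fraction(1, 10),
}


def adjusted_rating(rating: int, mistake: Optional[str] = None) -> float:
    """Multiply a 1-5 rating by the factor for the given mistake type.

    Assumption: an answer with no recorded mistake keeps its original
    rating (factor 1).
    """
    if not 1 <= rating <= 5:
        raise ValueError("rating must be between 1 (wrong) and 5 (perfect)")
    factor = Fraction(1) if mistake is None else MISTAKE_FACTORS[mistake]
    return float(rating * factor)


if __name__ == "__main__":
    for label in MISTAKE_FACTORS:
        print(f"rating 4, {label}: {adjusted_rating(4, label):.2f}")
```

Under this scheme, two answers with the same raw rating can end up far apart: a rating of 4 stays at 4 for a slightly different answer but drops to 1 when the answer contains wrong information.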