Efficient Transformer-Integrated Deep Neural Architectures for Robust EEG Decoding of Complex Visual Imagery

Byoung-Hee Kwon

TL;DR

Problem: Decoding complex visual imagery from non-invasive EEG for BCI applications is challenged by spatial variability and limited data. Approach: The authors propose a functional connectivity neural network (FCDN) guided by the phase-locking value (PLV). It combines a CNN-based temporal encoder with a data-efficient image transformer (DeiT) for spatial feature learning, uses the delta, theta, and alpha bands, and relies on DeiT's distillation-based training for data efficiency. Contributions: Offline analyses show an average accuracy of 0.7234 across 15 subjects, leave-one-subject-out (LOSO) cross-validation yields 0.4960, and pseudo-online decoding surpasses 0.75, with the functional connectivity block enhancing spatial discriminability. Significance: The method demonstrates robust, subject-independent EEG-based control suitable for EEG-driven robotic arms, with a validated 3D-BCI training setup supporting near real-time decoding and broader applicability.

Abstract

This study introduces a pioneering approach in brain-computer interface (BCI) technology, featuring our novel concept of complex visual imagery for non-invasive electroencephalography (EEG)-based communication. Complex visual imagery, as proposed in our work, involves the user engaging in the mental visualization of complex upper limb movements. This innovative approach significantly enhances the BCI system, facilitating the extension of its applications to more sophisticated tasks such as EEG-based robotic arm control. By leveraging this advanced form of visual imagery, our study opens new horizons for intricate and intuitive mind-controlled interfaces. We developed an advanced deep learning architecture that integrates functional connectivity metrics with a convolutional neural network-image transformer. This framework is adept at decoding subtle user intentions, addressing the spatial variability in complex visual tasks, and effectively translating these into precise commands for robotic arm control. Our comprehensive offline and pseudo-online evaluations demonstrate the framework's efficacy in real-time applications, including the nuanced control of robotic arms. The robustness of our approach is further validated through leave-one-subject-out cross-validation, marking a significant step towards versatile, subject-independent BCI applications. This research highlights the transformative impact of advanced visual imagery and deep learning in enhancing the usability and adaptability of BCI systems, particularly in robotic arm manipulation.

Paper Structure

This paper contains 15 sections, 8 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Experimental protocols for complex visual imagery from electroencephalography (EEG) signals. (a) Experimental environment for acquiring complex visual imagery data. (b) Experimental paradigm in a single trial and representation of visual cues according to each task. (c) The type of stimuli given to users in the experiment: picking up a cell phone (class-1), pouring water (class-2), opening a door (class-3), and eating food (class-4).
  • Figure 2: Connections between channel pairs with functional connectivity scores above 0.9 during the complex visual imagery task, for the high-performance group (Sub14, Sub09, and Sub04) and the low-performance group (Sub01, Sub08, and Sub13). Functional connectivity was assessed in the delta and alpha frequency ranges. In the delta band, the high-performance group's connections are concentrated in the prefrontal area, whereas the low-performance group's connections are irregular. In the alpha band (third row), the high-performance group's connections are concentrated in the occipital area, whereas the low-performance group's connections again show irregular tendencies.
  • Figure 3: Overview of the proposed architecture. The raw electroencephalogram (EEG) is first segmented into delta, theta, and alpha waves. In the functional connectivity layer, each of these bands is multiplied channel-wise by a functional connectivity score measured by the phase-locking value. The modified EEG data, which emphasize spatial information, are used as the input to the deep learning architecture. Temporal information is extracted by a convolutional block, followed by spatial information via a transformer block.
  • Figure 4: In the functional connectivity layer, the raw electroencephalogram (EEG) data are transformed through the phase-locking value (PLV). Each channel's connectivity score, derived via the PLV, is then normalized. Combining the normalized scores with the original EEG accentuates spatial detail, producing an EEG output that emphasizes spatial information.
  • Figure 5: The power spectral changes in each selected channel. The yellow box in the graph indicates the frequency range containing a significant peak. The peaks were observed in channels (a) Fz and (b) Oz related to visual imagery. The time-frequency analysis (c) presents the distinct characteristics of complex visual imagery. Prominent features are evident in the alpha band at the Fz and Oz locations, which are known to be associated with visual imagery.
  • ...and 3 more figures
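
The functional connectivity layer described in the captions of Figures 3 and 4 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the band edges, the Butterworth filter, the use of the mean PLV to all other channels as a per-channel score, the division by the maximum score as the normalization, and all function names (`bandpass`, `plv_matrix`, `connectivity_weights`, `fc_layer`) are hypothetical choices made here for clarity.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

# Conventional band edges in Hz; assumed here, not taken from the paper.
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0)}


def bandpass(eeg, low, high, fs, order=4):
    """Zero-phase Butterworth band-pass along the time axis (assumed filter)."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, eeg, axis=-1)


def plv_matrix(eeg):
    """Phase-locking value between every pair of channels.

    eeg has shape (channels, samples). PLV_ij is the magnitude of the
    time-averaged unit phasor exp(1j * (phase_i - phase_j)).
    """
    phase = np.angle(hilbert(eeg, axis=-1))
    unit = np.exp(1j * phase)
    return np.abs(unit @ unit.conj().T) / eeg.shape[-1]


def connectivity_weights(plv):
    """Per-channel score: mean PLV to all other channels, scaled to a maximum of 1.

    Both the averaging and the scaling are assumptions; the captions only
    state that a per-channel score is derived from the PLV and normalized.
    """
    n_channels = plv.shape[0]
    score = (plv.sum(axis=1) - np.diag(plv)) / (n_channels - 1)
    return score / score.max()


def fc_layer(eeg, fs):
    """Return a dict of band-filtered EEG, each weighted channel-wise by its PLV score."""
    weighted = {}
    for name, (low, high) in BANDS.items():
        band = bandpass(eeg, low, high, fs)
        weights = connectivity_weights(plv_matrix(band))
        weighted[name] = band * weights[:, None]
    return weighted
```

Calling `fc_layer(eeg, fs=250.0)` on an array of shape `(channels, samples)` returns one weighted array per band with the same shape as the input; these would then feed the convolutional and transformer blocks. The sampling rate of 250 Hz is only an example value.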