EEG classification for visual brain decoding with spatio-temporal and transformer based paradigms

Akanksha Sharma, Jyoti Nigam, Abhishek Rathore, Arnav Bhavsar

TL;DR

The paper tackles EEG-based visual brain decoding by proposing two integrated architectures that combine a 1D-CNN feature extractor with either a Bi-LSTM or a Transformer for sequential reasoning. It introduces a window-based classification approach with majority voting to cope with EEG non-stationarity, evaluated on the EEG-ImageNet dataset. Empirically, the CNN-Bi-LSTM framework achieves 71% accuracy, outperforming state-of-the-art methods, while the CNN-Transformer also surpasses many prior Transformer-based approaches, with extensive embedding and brain-mapping analyses supporting interpretability. The work advances multi-class visual EEG classification and demonstrates practical potential for brain-computer interface applications, backed by t-SNE and topographic brain maps that link neural patterns to visual categories.

Abstract

In this work, we address EEG classification for visual brain decoding with two frameworks built on two different learning paradigms. Considering the spatio-temporal nature of EEG data, one framework is based on a CNN-BiLSTM model. The other uses a CNN-Transformer architecture, which relies on the more versatile attention-based learning paradigm. In both cases, a dedicated 1D-CNN feature extraction module generates the initial embeddings with 1D convolutions in the time and EEG channel domains. Because EEG signals are noisy and non-stationary, and their discriminative features are less clear than in semantically structured data such as text or images, we also perform window-based classification followed by majority voting during inference to yield labels at the signal level. To illustrate how brain patterns correlate with different image classes, we visualize t-SNE plots of the BiLSTM embeddings alongside brain activation maps for the top 10 classes. These visualizations reveal distinct neural signatures associated with each visual category, showing that the BiLSTM captures discriminative brain activity linked to visual stimuli. We evaluate our approach on the updated EEG-ImageNet dataset and compare favorably with state-of-the-art methods.
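
The window-based inference described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the stride, and the toy per-window classifier are hypothetical, and the 440-sample signal length is an assumption chosen only so that 220-sample windows (the size given in the Figure 2 caption) yield 11 windows. Whether the paper's windows overlap is not stated in this summary.

```python
from collections import Counter
from typing import Callable, List, Sequence


def sliding_windows(signal: Sequence[float], window: int, stride: int) -> List[Sequence[float]]:
    """Split a 1-D signal into fixed-length windows taken every `stride` samples."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    return [signal[start:start + window]
            for start in range(0, len(signal) - window + 1, stride)]


def majority_vote(labels: Sequence[int]) -> int:
    """Return the most frequent label among the per-window predictions."""
    if not labels:
        raise ValueError("no window predictions to vote on")
    return Counter(labels).most_common(1)[0][0]


def classify_signal(signal: Sequence[float], window: int, stride: int,
                    window_classifier: Callable[[Sequence[float]], int]) -> int:
    """Classify each window independently, then vote to get one signal-level label."""
    windows = sliding_windows(signal, window, stride)
    return majority_vote([window_classifier(w) for w in windows])


if __name__ == "__main__":
    # Assumed signal length (440) and stride (22); only the window size (220)
    # and the window count (11) come from the Figure 2 caption.
    signal = list(range(440))
    print(len(sliding_windows(signal, window=220, stride=22)))  # 11

    # Hypothetical stand-in for a trained per-window model.
    def toy_classifier(w: Sequence[float]) -> int:
        return int(w[0] >= 100)

    print(classify_signal(signal, 220, 22, toy_classifier))  # 1
```

Voting over windows means a few misclassified windows do not change the signal-level label, which is one plausible reason the approach suits noisy, non-stationary EEG.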

Paper Structure (19 sections, 7 figures, 3 tables)

Figures (7)

  • Figure 1: 40 classes present in EEG-ImageNet dataset
  • Figure 2: The architecture diagram shows the following steps. During pre-processing, an EEG sample is divided into $11$ segments of $220$ samples each. A Feature Extraction block then uses a CNN to capture spatial and temporal information, and a Sequential block, which can be a Bi-LSTM or a Transformer, learns sequential relations. Finally, a Classifier block with dense layers classifies the embeddings.
  • Figure 3: Architecture of the stacked Bi-LSTM in the sequential block. The extracted features are fed to it in both forward and reverse order so that it learns the sequence in both directions. The current feature is denoted $x_t$ and the previous feature $x_{t-1}$.
  • Figure 4: Architecture of the Transformer for the sequential block. It consists of only the encoder part, with a single encoder layer. The extracted features from the feature extractor are fed to it to produce embeddings, which are used for classification.
  • Figure 5: Bar graph for accuracy of individual classes
  • ...and 2 more figures