Advancing EEG-Based Gaze Prediction Using Depthwise Separable Convolution and Enhanced Pre-Processing

Matthew L Key; Tural Mehtiyev; Xiaodong Qu

TL;DR

The paper addresses the challenge of accurately predicting gaze position from EEG data by leveraging a hybrid EEGViT architecture enhanced with depthwise separable convolutions and a clustering-based pre-processing pipeline. The proposed EEG-DCViT combines DS-CNNs with data clustering to improve feature extraction and label fidelity, yielding superior performance on the EEGEyeNet Absolute Position task. It achieves a new benchmark RMSE of $51.6 \pm 0.2$ mm, surpassing the prior $55.4 \pm 0.2$ mm, thus demonstrating the value of targeted pre-processing and architectural refinements for EEG-based gaze estimation. The work has practical implications for EEG-based brain-computer interfaces and emphasizes the importance of data-quality improvements and efficient neural architectures in neural decoding tasks.
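The label-correction idea and the RMSE metric mentioned above can be illustrated with a minimal sketch. It assumes that label correction amounts to snapping each recorded gaze label to its nearest target centroid, and that the error per sample is the Euclidean distance between predicted and target positions; the summary confirms neither detail. The function names, the 2x2 grid, and the coordinates are hypothetical.

```python
import math


def snap_to_centroid(label, centroids):
    """Return the centroid closest (Euclidean distance) to a noisy (x, y) label."""
    return min(centroids, key=lambda c: math.dist(label, c))


def rmse(predictions, targets):
    """Root mean square of the Euclidean error between paired 2-D points."""
    squared = [math.dist(p, t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical 2x2 grid of target centroids (not the paper's actual grid).
centroids = [(100.0, 100.0), (100.0, 300.0), (300.0, 100.0), (300.0, 300.0)]

# Hypothetical recorded labels that deviate slightly from the true targets.
noisy_labels = [(108.0, 95.0), (290.0, 310.0)]

# Replace each noisy label with its nearest centroid before training.
clean_labels = [snap_to_centroid(p, centroids) for p in noisy_labels]
```

In this sketch the corrected labels, not the raw recordings, would serve as regression targets, so label noise around each stimulus position no longer contributes to the training loss.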

Abstract

In the field of EEG-based gaze prediction, the application of deep learning to interpret complex neural data poses significant challenges. This study evaluates the effectiveness of pre-processing techniques and the effect of additional depthwise separable convolution on EEG vision transformers (ViTs) in a pretrained model architecture. We introduce a novel method, the EEG Deeper Clustered Vision Transformer (EEG-DCViT), which combines depthwise separable convolutional neural networks (CNNs) with vision transformers, enriched by a pre-processing strategy involving data clustering. The new approach demonstrates superior performance, establishing a new benchmark with a Root Mean Square Error (RMSE) of 51.6 mm. This achievement underscores the impact of pre-processing and model refinement in enhancing EEG-based applications.
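To give a sense of why depthwise separable convolution is considered efficient, the sketch below compares weight counts for a standard convolution and its depthwise separable counterpart (one depthwise filter per input channel, followed by a 1x1 pointwise convolution). The channel counts and kernel size are illustrative assumptions, not the layer dimensions used in EEG-DCViT, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k_h, k_w):
    """Weights in a standard 2-D convolution (bias ignored)."""
    return c_in * c_out * k_h * k_w


def depthwise_separable_params(c_in, c_out, k_h, k_w):
    """Weights in a depthwise conv (one k_h x k_w filter per input channel)
    followed by a 1x1 pointwise conv that mixes channels (bias ignored)."""
    depthwise = c_in * k_h * k_w
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer shape chosen only for illustration.
c_in, c_out, k_h, k_w = 256, 768, 8, 1

standard = standard_conv_params(c_in, c_out, k_h, k_w)
separable = depthwise_separable_params(c_in, c_out, k_h, k_w)

# The ratio simplifies to 1/c_out + 1/(k_h * k_w).
ratio = separable / standard
print(f"standard: {standard}, separable: {separable}, ratio: {ratio:.4f}")
```

For this hypothetical shape the separable layer needs roughly one eighth of the weights, which is the general pattern: the saving grows with kernel area and output channel count.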

Paper Structure (20 sections, 8 figures, 2 tables)

1 Introduction
1.1 Research Questions
2 Related Work
3 Methods
Data Pre-Processing:
Depthwise-separable convolutional neural networks (DS-CNNs):
Evaluation Metrics:
Early Stopping:
Method 1: EEGViT Trained with DS-CNNs:
Method 2: EEGViT Trained with Clustered Data:
Method 3 (EEG-DCViT): EEGViT Trained with Clustered and DS-CNNs:
4 Dataset
5 Results
6 Discussion
Computational Complexity:
...and 5 more sections

Figures (8)

Figure 1: Large Grid Experimental Setup: This image illustrates the schematic view of the experimental setup and the stimuli placement on the screen. It gives a visual representation of how participants interacted with the stimuli during the eye-tracking events [kastrati2021eegeyenet].
Figure 2: Clustering illustrates the discrepancy between labeled positions and actual target positions.
Figure 3: The centroids used to correct training data labels.
Figure 4: EEG Vision Transformer with Depthwise Separable Convolution: A specialized ViT structure tailored for raw EEG signal input. This architecture utilizes a quad-step convolution process to produce patch embeddings. The dotted outline highlights the depthwise separable convolution. After this initial step, positional embeddings are integrated and the combined sequence is subsequently passed through the ViT layers [midterm-eeg-vit]. The design of the positional embedding and ViT layer is adapted from [dosovitskiy2021image].
Figure 5: Classification Performance Metrics by Cluster: This figure presents a detailed breakdown of classification metrics including precision, recall, F1-score, and support for 25 clusters, highlighting the performance of each cluster in the model evaluation.
...and 3 more figures
