MECASA: Motor Execution Classification using Additive Self-Attention for Hybrid EEG-fNIRS Data
Gourav Siddhad, Juhi Singh, Partha Pratim Roy
TL;DR
This work addresses motor execution state classification using multimodal EEG and fNIRS data. It introduces MECASA, a convolutional additive self-attention architecture that builds on the CASA module of the CAS-ViT backbone and adds a dedicated fusion network to integrate EEG and fNIRS features. In experiments on the SMR Hybrid BCI dataset, MECASA outperforms unimodal and other multimodal baselines; fNIRS is generally more accurate than EEG alone, and fusing the two yields the highest accuracy. The findings support deep-learning-driven EEG-fNIRS fusion for practical BCI applications, including potential real-time motor rehabilitation and cognitive assessment, and offer guidance on which data representations and embedding dimensions perform best.
Abstract
Motor execution, a fundamental aspect of human behavior, has been extensively studied using brain-computer interface (BCI) technologies. Electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) have each provided valuable insights, but their individual limitations have hindered performance. This study investigates the effectiveness of fusing EEG and fNIRS data for classifying rest versus task states in a motor execution paradigm. Using the SMR Hybrid BCI dataset, this work compares unimodal (EEG and fNIRS) classifiers with a multimodal fusion approach. It proposes Motor Execution using Convolutional Additive Self-Attention Mechanisms (MECASA), a novel architecture that combines convolutional operations and self-attention to capture complex patterns in multimodal data. MECASA, built upon the CAS-ViT architecture, employs a computationally efficient, convolution-based self-attention module (CASA), a hybrid block design, and a dedicated fusion network to combine features from separate EEG and fNIRS processing streams. Experimental results demonstrate that MECASA outperforms established methods across all modalities (EEG, fNIRS, and fused), with fusion consistently improving accuracy over single-modality approaches. fNIRS generally achieved higher accuracy than EEG alone. Ablation studies revealed optimal configurations for MECASA: embedding dimensions of 64 to 128 performed best for EEG data, and OD128 (upsampled optical density) yielded the best results for fNIRS data. This work highlights the potential of deep learning, specifically MECASA, to enhance EEG-fNIRS fusion for BCI applications.
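To make the two ideas in the abstract concrete (additive self-attention, and fusion of features from separate EEG and fNIRS streams), the following is a minimal sketch and not the MECASA implementation. It assumes a parameter-free channel gate in place of the learned convolutional context mappings, and plain concatenation in place of the dedicated fusion network, whose design is not described here. All function names (`channel_gate`, `additive_attention`, `mean_pool`, `fuse`) are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(tokens):
    """Scale each channel by the sigmoid of its mean over all tokens.

    Assumption: a fixed stand-in for a learned context mapping.
    """
    n, dim = len(tokens), len(tokens[0])
    means = [sum(tok[c] for tok in tokens) / n for c in range(dim)]
    gates = [sigmoid(m) for m in means]
    return [[tok[c] * gates[c] for c in range(dim)] for tok in tokens]


def additive_attention(q, k, v):
    """Additive similarity: context(Q) + context(K), applied element-wise to V.

    No token-by-token similarity matrix is formed.
    """
    ctx_q = channel_gate(q)
    ctx_k = channel_gate(k)
    return [
        [(cq + ck) * val for cq, ck, val in zip(rq, rk, rv)]
        for rq, rk, rv in zip(ctx_q, ctx_k, v)
    ]


def mean_pool(tokens):
    """Average over tokens to get one feature vector per stream."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]


def fuse(eeg_features, fnirs_features):
    """Late fusion by concatenation (assumption: simplest possible fusion)."""
    return list(eeg_features) + list(fnirs_features)
```

In this sketch each stream would run `additive_attention` and `mean_pool` on its own tokens before `fuse` joins the two feature vectors for a rest-versus-task classifier. Because the query and key contexts are summed rather than multiplied token against token, the cost grows linearly with the number of tokens.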
