GCMCG: A Clustering-Aware Graph Attention and Expert Fusion Network for Multi-Paradigm, Multi-task, and Cross-Subject EEG Decoding

Yiqiao Chen, Zijian Huang, Juchi He, Fazheng Xu, Zhenghui Feng

TL;DR

The paper tackles robust decoding of MI-ME EEG signals across subjects and paradigms by introducing GCMCG, a graph-guided, clustering-aware Mixture-of-Experts (MoE) framework. It combines ICA-WT denoising, a learnable electrode tokenizer built on a Graph Attention Network (GAT), spectral clustering to form functional regions, region-specific CNN-GRU experts, and entropy-regularized MoE fusion, all trained with a three-stage strategy to improve cross-subject generalization. On EEGmmidb, BCI-IV 2a, and M3CV, the paper reports overall accuracies of 86.60%, 98.57%, and 99.61%, respectively, which the authors present as state-of-the-art on the harder datasets and stable across datasets, and it backs these results with ablation and interpretability analyses. The design targets practical BCI use by supporting flexible electrode layouts, variable sequence lengths, and multi-paradigm decoding.

Abstract

Brain-Computer Interfaces (BCIs) based on Motor Execution (ME) and Motor Imagery (MI) electroencephalogram (EEG) signals offer a direct pathway for human-machine interaction. However, developing robust decoding models remains challenging due to the complex spatio-temporal dynamics of EEG, its low signal-to-noise ratio, and the limited generalizability of many existing approaches across subjects and paradigms. To address these issues, this paper proposes Graph-guided Clustering Mixture-of-Experts CNN-GRU (GCMCG), a novel unified framework for MI-ME EEG decoding. Our approach integrates a robust preprocessing stage using Independent Component Analysis and Wavelet Transform (ICA-WT) for effective denoising. We further introduce a pre-trainable graph tokenization module that dynamically models electrode relationships via a Graph Attention Network (GAT), followed by unsupervised spectral clustering to decompose signals into interpretable functional brain regions. Each region is processed by a dedicated CNN-GRU expert network, and a gated fusion mechanism with L1 regularization adaptively combines these local features with a global expert. This Mixture-of-Experts (MoE) design enables deep spatio-temporal fusion and enhances representational capacity. A three-stage training strategy incorporating focal loss and progressive sampling is employed to improve cross-subject generalization and handle class imbalance. Evaluated on three public datasets of varying complexity (EEGmmidb-BCI2000, BCI-IV 2a, and M3CV), GCMCG achieves overall accuracies of 86.60%, 98.57%, and 99.61%, respectively, which demonstrates its superior effectiveness and strong generalization capability for practical BCI applications.
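
The abstract names focal loss as the tool for handling class imbalance. As a rough illustration only, the standard multi-class form $FL(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log p_t$ can be sketched in plain Python as below. The function names, the default `gamma=2.0`, and the optional per-class `alpha` weights are assumptions made for this sketch; the values the authors actually use are not stated in this summary.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss for one sample (hypothetical helper, not the authors' code).

    gamma down-weights well-classified samples; gamma=0 recovers plain
    cross-entropy. alpha is an optional list of per-class weights.
    """
    p_t = max(softmax(logits)[target], 1e-12)  # clamp to avoid log(0)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct sample versus a confidently wrong one (true class 0).
easy = focal_loss([4.0, 0.0, 0.0], target=0)
hard = focal_loss([0.0, 4.0, 0.0], target=0)
ce_easy = focal_loss([4.0, 0.0, 0.0], target=0, gamma=0.0)
```

With `gamma > 0`, the easy sample contributes far less than it would under cross-entropy, so training gradient is concentrated on hard or minority-class trials.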

Paper Structure

This paper contains 25 sections, 24 equations, 11 figures, 5 tables, 3 algorithms.

Figures (11)

  • Figure 1: Overview of the proposed GCMCG framework. The green modules denote real input signals and electrode metadata. Learnable parameters are annotated in light gray with Update: $\cdot$ for each module.
  • Figure 2: Eight-connected graph structure centered on $\text{FC}_1$ and $\text{C}_z$, illustrating spatial electrode neighborhoods used for EEG graph construction.
  • Figure 3: Schematic diagram of the hybrid denoising method. In the first stage, a notch filter and a second-order Butterworth bandpass filter remove external artifacts. In the second stage, the ICA-WT algorithm removes physiological artifacts ($cA$ denotes the wavelet approximation coefficients and $cD$ the wavelet detail coefficients).
  • Figure 4: Data-slicing process applied to the EEG recordings of subject S001 during Task 3 of the EEGmmidb dataset. The continuous EEG signals are sliced into segments and labeled according to the annotation file (T0: rest; T1: opening and closing of the left fist; T2: opening and closing of the right fist). Only the first four channels and the last channel are displayed.
  • Figure 5: Illustration of EEG signal denoising using frequency-domain filtering followed by the ICA–WT hybrid method. A representative EEG channel is shown to compare the original signal (blue), the frequency-filtered signal (green), and the fully denoised signal (red). The sequential application of frequency filtering and ICA–WT denoising effectively suppresses noise while preserving the essential signal information.
  • ...and 6 more figures
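
The eight-connected neighborhood described in the Figure 2 caption can be sketched as follows: electrodes are placed on a 2D grid, and each one is linked to its horizontal, vertical, and diagonal neighbors. The grid below is a hypothetical fragment of a 10-10 style layout chosen so that $\text{FC}_1$ and $\text{C}_z$ each have a full set of eight neighbors; the paper's actual electrode grid and the function name `eight_connected_neighbours` are assumptions of this sketch.

```python
# Hypothetical 4x5 fragment of a 10-10 style electrode layout (assumption).
GRID = [
    ["F3", "F1", "Fz", "F2", "F4"],
    ["FC3", "FC1", "FCz", "FC2", "FC4"],
    ["C3", "C1", "Cz", "C2", "C4"],
    ["CP3", "CP1", "CPz", "CP2", "CP4"],
]


def eight_connected_neighbours(grid):
    """Map each electrode to the electrodes in its 8-neighbourhood.

    Cells set to None are treated as empty positions and skipped.
    """
    by_pos = {
        (r, c): name
        for r, row in enumerate(grid)
        for c, name in enumerate(row)
        if name is not None
    }
    neighbours = {name: [] for name in by_pos.values()}
    for (r, c), name in by_pos.items():
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue  # skip the electrode itself
                other = by_pos.get((r + dr, c + dc))
                if other is not None:
                    neighbours[name].append(other)
    return neighbours


adjacency = eight_connected_neighbours(GRID)
cz_neighbours = sorted(adjacency["Cz"])
fc1_neighbours = sorted(adjacency["FC1"])
```

An adjacency of this kind is the sort of fixed spatial prior a graph attention layer can start from; the attention weights then decide how strongly each neighboring electrode contributes.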