ASPEN: Spectral-Temporal Fusion for Cross-Subject Brain Decoding

Megan Lee, Seung Ha Hwang, Inhyeok Choi, Shreyas Darade, Mengchun Zhang, Kateryna Shapovalenko

TL;DR

This paper introduces ASPEN, a hybrid architecture that combines spectral and temporal feature streams via multiplicative fusion, so that features propagate only when the two streams agree, and shows that multiplicative multimodal fusion enables effective cross-subject generalization.

Abstract

Cross-subject generalization in EEG-based brain-computer interfaces (BCIs) remains challenging due to individual variability in neural signals. We investigate whether spectral representations offer more stable features for cross-subject transfer than temporal waveforms. Through correlation analyses across three EEG paradigms (SSVEP, P300, and Motor Imagery), we find that spectral features exhibit consistently higher cross-subject similarity than temporal signals. Motivated by this observation, we introduce ASPEN, a hybrid architecture that combines spectral and temporal feature streams via multiplicative fusion, requiring cross-modal agreement for features to propagate. Experiments across six benchmark datasets reveal that ASPEN dynamically achieves the optimal spectral-temporal balance for each paradigm. ASPEN achieves the best unseen-subject accuracy on three of the six datasets and competitive performance on the others, demonstrating that multiplicative multimodal fusion enables effective cross-subject generalization.
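
The "cross-modal agreement" property of multiplicative fusion can be illustrated with a minimal sketch. It assumes the two streams have already been projected to feature vectors of the same shape and fuses them with a plain element-wise product; the function name `multiplicative_fusion` and the toy activations are hypothetical, and the actual ASPEN fusion block may include additional projections, gating, or normalization not shown here.

```python
import numpy as np

def multiplicative_fusion(temporal_feat, spectral_feat):
    """Element-wise (Hadamard) product of two same-shaped feature arrays.

    Illustrative only: a hypothetical stand-in for the fusion step,
    not the paper's exact layer.
    """
    temporal_feat = np.asarray(temporal_feat, dtype=float)
    spectral_feat = np.asarray(spectral_feat, dtype=float)
    if temporal_feat.shape != spectral_feat.shape:
        raise ValueError("streams must share a shape before fusion")
    return temporal_feat * spectral_feat

# Toy activations (made up for illustration): the streams agree on
# feature 0, disagree on features 1 and 2, and are both weak on feature 3.
temporal = np.array([0.9, 0.8, 0.0, 0.1])
spectral = np.array([0.9, 0.0, 0.7, 0.1])

fused = multiplicative_fusion(temporal, spectral)
additive = temporal + spectral  # additive baseline for comparison

print("multiplicative:", fused)
print("additive:      ", additive)
```

In this toy case only the feature that is active in both streams survives the product, whereas an additive combination passes along features supported by a single stream.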

Paper Structure

This paper contains 26 sections, 34 equations, 5 figures, 15 tables.

Figures (5)

  • Figure 1: ASPEN architecture. Raw EEG is processed through parallel temporal and spectral streams, combined via multiplicative fusion before classification.
  • Figure 2: Cross-session (left) and cross-subject (right) correlation comparison between temporal and spectral representations. Spectral features exhibit consistently higher cross-subject similarity across all datasets.
  • Figure 3: Detailed view of temporal stream, spectral stream, and multiplicative fusion components.
  • Figure 4: Stream contributions and feature correlation ($\rho$) across datasets. Low correlation values confirm that streams capture distinct information.
  • Figure 5: Grad-CAM visualization of feature importance for P300 classification. Correct prediction (top) shows focused attention on physiologically relevant low-frequency bands. Misclassification (bottom) reveals scattered attention towards high-frequency noise artifacts.
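
The kind of comparison summarized in Figure 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's analysis pipeline: it assumes cross-subject similarity is measured as the mean pairwise Pearson correlation between per-subject signals, uses the FFT amplitude spectrum as the spectral representation, and runs on synthetic data (a shared 10 Hz oscillation with a different phase per subject). The helper `mean_pairwise_corr` and all parameter values are hypothetical.

```python
import numpy as np

def mean_pairwise_corr(X):
    """Mean Pearson correlation over all distinct pairs of rows in X."""
    corr = np.corrcoef(X)                     # (n_rows, n_rows) matrix
    upper = np.triu_indices(X.shape[0], k=1)  # each pair once, no diagonal
    return float(corr[upper].mean())

# Synthetic stand-in for per-subject averaged responses (not paper data):
# four "subjects" share a 10 Hz rhythm but differ in phase.
fs, n_samples = 128, 256                      # assumed sampling rate and length
t = np.arange(n_samples) / fs
phases = np.array([0.0, 0.5, 1.0, 1.5]) * np.pi
rng = np.random.default_rng(0)
signals = np.sin(2 * np.pi * 10.0 * t[None, :] + phases[:, None])
signals = signals + 0.1 * rng.standard_normal(signals.shape)

# Temporal similarity: correlate the raw waveforms across subjects.
temporal_sim = mean_pairwise_corr(signals)

# Spectral similarity: correlate amplitude spectra, which discard phase.
spectra = np.abs(np.fft.rfft(signals, axis=1))
spectral_sim = mean_pairwise_corr(spectra)

print(f"temporal: {temporal_sim:.2f}, spectral: {spectral_sim:.2f}")
```

Because the amplitude spectrum discards phase, subjects whose waveforms are misaligned in time can still look nearly identical in the spectral domain, which is one plausible reading of why spectral features would show higher cross-subject similarity.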