MEGState: Phoneme Decoding from Magnetoencephalography Signals
Shuntaro Suzuki, Chia-Chun Dan Hsu, Yu Tsao, Komei Sugiura
TL;DR
This work introduces MEGState, a non-invasive phoneme decoding framework for MEG signals that combines a Multi-Resolution Convolution module with a Sensor-wise SSM to capture both fine-grained local dynamics and global, sensor-wide temporal dependencies. Built on state-space modeling with S5/HiPPO-N, the architecture supports efficient long-range temporal modeling and is trained end-to-end on the LibriBrain dataset with a novel sampling-based augmentation strategy. Experiments show consistent improvements over baselines in accuracy, Cohen’s kappa, and macro-F1, and ensembling predictions yields further gains on a leaderboard. These results support MEG-based phoneme decoding as a viable pathway toward non-invasive speech BCIs and offer a scalable framework for reconstructing linguistically meaningful representations from MEG without invasive procedures.
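For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how accuracy, Cohen's kappa, and macro-F1 can be computed from a confusion matrix using their standard definitions. It is not the authors' evaluation code: the function names, the three-class toy labels, and the convention of scoring F1 as 0 for a class that never appears in either the labels or the predictions are all assumptions made for illustration.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true class, columns the predicted class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)  # accumulate counts, duplicates included
    return cm


def accuracy(cm):
    """Fraction of samples on the diagonal."""
    return np.trace(cm) / cm.sum()


def cohens_kappa(cm):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = cm.sum()
    p_o = np.trace(cm) / n
    # Chance agreement from the marginal class frequencies.
    p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2
    return (p_o - p_e) / (1 - p_e)


def macro_f1(cm):
    """Unweighted mean of per-class F1 = 2TP / (2TP + FP + FN)."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    denom = 2 * tp + fp + fn
    # Assumed convention: a class with no support and no predictions scores 0.
    safe = np.where(denom > 0, denom, 1.0)
    f1 = np.where(denom > 0, 2 * tp / safe, 0.0)
    return f1.mean()


# Hypothetical toy example with three classes (not real phoneme labels).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(accuracy(cm), cohens_kappa(cm), macro_f1(cm))
```

Macro-F1 weights every class equally, which makes it more sensitive than accuracy to rare classes, while Cohen's kappa discounts the agreement expected from the class marginals alone.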
Abstract
Decoding linguistically meaningful representations from non-invasive neural recordings remains a central challenge in neural speech decoding. Among available neuroimaging modalities, magnetoencephalography (MEG) provides a safe and repeatable means of mapping speech-related cortical dynamics, yet its low signal-to-noise ratio and high temporal dimensionality continue to hinder robust decoding. In this work, we introduce MEGState, a novel architecture for phoneme decoding from MEG signals that captures fine-grained cortical responses evoked by auditory stimuli. Extensive experiments on the LibriBrain dataset demonstrate that MEGState consistently surpasses baseline models across multiple evaluation metrics. These findings highlight the potential of MEG-based phoneme decoding as a scalable pathway toward non-invasive brain-computer interfaces for speech.
