MEGConformer: Conformer-Based MEG Decoder for Robust Speech and Phoneme Classification
Xabier de Zuazo, Ibon Saratxaga, Eva Navas
TL;DR
This work investigates decoding speech and phoneme information from MEG signals using a Conformer-based decoder trained on LibriBrain PNPL tasks. A unified MEG backbone with task-specific heads, MEG-tailored augmentation, and robust normalization enable competitive performance on Speech Detection and Phoneme Classification within the Standard track. Key findings show strong holdout results (88.9% Speech F1-macro, 65.8% Phoneme F1-macro) with critical contributions from input window length, dynamic grouping, class weighting, and instance normalization. The study demonstrates the viability of adapting ASR architectures to MEG decoding and highlights directions for end-to-end reconstruction and larger-scale MEG datasets.
Abstract
We present Conformer-based decoders for the LibriBrain 2025 PNPL competition, targeting two foundational MEG tasks: Speech Detection and Phoneme Classification. Our approach adapts a compact Conformer to raw 306-channel MEG signals, with a lightweight convolutional projection layer and task-specific heads. For Speech Detection, a MEG-oriented SpecAugment provided a first exploration of MEG-specific augmentation. For Phoneme Classification, we used inverse-square-root class weighting and a dynamic grouping loader to handle 100-sample averaged examples. In addition, a simple instance-level normalization proved critical for mitigating distribution shifts on the holdout split. Using the official Standard track splits and F1-macro for model selection, our best systems achieved 88.9% (Speech) and 65.8% (Phoneme) on the leaderboard, surpassing the competition baselines and ranking within the top 10 in both tasks. Technical documentation, source code, and checkpoints are available at https://github.com/neural2speech/libribrain-experiments.
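Two of the ingredients named above, instance-level normalization and inverse-square-root class weighting, can be illustrated with a minimal NumPy sketch. It is not taken from the released code: the function names, the per-channel z-scoring over time, the `eps` constant, and the rescaling of the weights to a mean of 1 are all assumptions made for illustration.

```python
import numpy as np


def instance_normalize(x, eps=1e-6):
    """Z-score one MEG window per channel over time.

    x has shape (channels, time). Assumption: statistics are computed
    from the window itself, so no dataset-level statistics are needed.
    """
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)


def inverse_sqrt_class_weights(labels, num_classes):
    """Weight each class proportionally to 1 / sqrt(class count).

    Assumption: weights are rescaled to have mean 1 so the overall
    loss scale stays comparable to the unweighted case.
    """
    counts = np.bincount(labels, minlength=num_classes).astype(np.float64)
    counts = np.maximum(counts, 1.0)  # guard against empty classes
    weights = 1.0 / np.sqrt(counts)
    return weights / weights.mean()


# Hypothetical example: one 306-channel window and a toy label set.
window = np.random.default_rng(0).normal(5.0, 3.0, size=(306, 125))
normalized = instance_normalize(window)

toy_labels = np.array([0] * 100 + [1] * 25 + [2] * 4)
class_weights = inverse_sqrt_class_weights(toy_labels, num_classes=3)
```

In this toy case the rarest class receives the largest weight, but the ratio between weights grows only with the square root of the count ratio, which is gentler than plain inverse-frequency weighting.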
