MEG-to-MEG Transfer Learning and Cross-Task Speech/Silence Detection with Limited Data
Xabier de Zuazo, Vincenzo Verbeni, Eva Navas, Ibon Saratxaga, Mathieu Bourguignon, Nicola Molinaro
TL;DR
This work tackles data efficiency in MEG-based speech decoding. A Conformer-style model is pre-trained on 50 hours of single-subject listening MEG data and then fine-tuned on only about 5 minutes per subject across 18 participants for perception and production tasks. The key finding is that large-scale pre-training improves both in-task performance and cross-task generalization, enabling cross-task decoding between listening, playback, and production, with cross-task gains of up to 5–6%. Importantly, production-trained models can decode passive listening above chance, indicating shared neural representations beyond task-specific motor activity. The study advances practical MEG-based neurotechnologies by demonstrating data-efficient transfer learning and by highlighting asymmetries in cross-task transfer that reflect motor planning involvement in production.
Abstract
Data-efficient neural decoding is a central challenge for speech brain-computer interfaces. We present the first demonstration of transfer learning and cross-task decoding for MEG-based speech models spanning perception and production. We pre-train a Conformer-based model on 50 hours of single-subject listening data and fine-tune on just 5 minutes per subject across 18 participants. Transfer learning yields consistent improvements, with in-task accuracy gains of 1–4% and larger cross-task gains of up to 5–6%. Not only does pre-training improve performance within each task, but it also enables reliable cross-task decoding between perception and production. Critically, models trained on speech production decode passive listening above chance, confirming that learned representations reflect shared neural processes rather than task-specific motor activity.
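
As an illustration of what "above chance" can mean for a binary speech/silence decoder, the following is a minimal sketch of a one-sided binomial test. It is not the authors' evaluation procedure: the function name `binomial_p_value`, the example counts, and the 0.5 chance level are all assumptions made for illustration. The test also assumes independent windows, which contiguous MEG segments generally do not satisfy, and a realistic chance level depends on the speech/silence class balance, so a permutation test on the actual labels may be more appropriate in practice.

```python
from math import comb


def binomial_p_value(n_correct: int, n_total: int, chance: float = 0.5) -> float:
    """One-sided p-value P(X >= n_correct) for X ~ Binomial(n_total, chance).

    Hypothetical helper: `chance` is an assumed chance-level accuracy,
    not a value reported in the paper.
    """
    if not 0 <= n_correct <= n_total:
        raise ValueError("n_correct must lie between 0 and n_total")
    # Sum the binomial probability mass over all outcomes at least as
    # good as the observed number of correctly classified windows.
    return sum(
        comb(n_total, k) * chance**k * (1 - chance) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )


# Made-up example: 60 of 100 held-out windows classified correctly,
# tested against an assumed balanced chance level of 0.5.
p = binomial_p_value(60, 100, chance=0.5)
print(f"p-value: {p:.4f}")
```

A small p-value here would suggest that the decoder's accuracy is unlikely under the assumed chance level; it says nothing about the size of the effect or about which neural processes drive it.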
