Subject Disentanglement Neural Network for Speech Envelope Reconstruction from EEG
Li Zhang, Jiyao Liu
TL;DR
The paper tackles cross-subject variability in EEG-based speech envelope reconstruction. It introduces SDN-Net, a three-module architecture comprising MLA-Codec, CTA-MTDNN, and MPN-MI, aimed at subject-independent envelope decoding. MLA-Codec decodes envelopes from EEG; CTA-MTDNN extracts subject identity features using multi-scale time-delay layers with channel and temporal attention; MPN-MI enforces independence between the reconstructed envelope and subject identity via variational mutual information estimation within a joint objective. On the Auditory EEG Decoding Dataset, SDN-Net outperforms state-of-the-art methods for both inner- and cross-subject reconstruction, improving cross-subject generalization in neural speech decoding, with potential relevance for robust brain-computer interfaces.
Abstract
Reconstructing speech envelopes from EEG signals is essential for exploring the neural mechanisms underlying speech perception. However, inter-subject EEG variability and physiological artifacts complicate accurate reconstruction. To address this problem, we introduce the Subject Disentangling Neural Network (SDN-Net), which disentangles subject identity information from reconstructed speech envelopes to enhance cross-subject reconstruction accuracy. SDN-Net integrates three key components: MLA-Codec, CTA-MTDNN, and MPN-MI. The MLA-Codec, a fully convolutional neural network, decodes EEG signals into speech envelopes. The CTA-MTDNN module, a multi-scale time-delay neural network with channel and temporal attention, extracts subject identity features from EEG signals. Lastly, the MPN-MI module, a mutual information estimator built on a multi-layer perceptron, supervises the removal of subject identity information from the reconstructed speech envelope. Experiments on the Auditory EEG Decoding Dataset demonstrate that SDN-Net achieves superior performance in inner- and cross-subject speech envelope reconstruction compared to recent state-of-the-art methods.
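
As an illustration of how a reconstruction term and a mutual-information penalty could be combined into one training objective, the following is a minimal NumPy sketch and not the authors' implementation. Several choices are assumptions made here for illustration only: the abstract does not name the reconstruction criterion, so Pearson correlation is used; it does not say which variational bound MPN-MI relies on, so the Donsker-Varadhan bound is swapped in; and the function names, the critic scores, and the `weight` value are hypothetical.

```python
import numpy as np


def pearson_r(pred, target):
    """Pearson correlation between two 1-D envelopes (assumed criterion)."""
    p = pred - pred.mean()
    t = target - target.mean()
    denom = np.sqrt((p ** 2).sum() * (t ** 2).sum()) + 1e-8  # avoid divide-by-zero
    return float((p * t).sum() / denom)


def dv_mi_lower_bound(joint_scores, marginal_scores):
    """Donsker-Varadhan bound: E_joint[T] - log E_marginal[exp(T)].

    joint_scores:    critic outputs on paired (envelope, subject-feature) samples
    marginal_scores: critic outputs on shuffled (unpaired) samples
    In a real system the critic would be a trained network (hypothetical here).
    """
    m = marginal_scores.max()  # shift for a numerically stable log-mean-exp
    log_mean_exp = m + np.log(np.mean(np.exp(marginal_scores - m)))
    return float(joint_scores.mean() - log_mean_exp)


def joint_loss(pred, target, joint_scores, marginal_scores, weight=0.1):
    """Reconstruction loss plus a weighted MI penalty (weight is an assumption)."""
    recon = 1.0 - pearson_r(pred, target)
    mi = dv_mi_lower_bound(joint_scores, marginal_scores)
    # Clamp at zero: a poorly fitted critic can yield a negative estimate.
    return recon + weight * max(mi, 0.0)


# Usage on synthetic data: a noisy copy of a smooth "envelope".
rng = np.random.default_rng(0)
target = np.abs(np.sin(np.linspace(0.0, 8.0, 640)))
pred = target + 0.2 * rng.standard_normal(target.shape)
loss = joint_loss(pred, target, rng.standard_normal(64) + 0.5, rng.standard_normal(64))
print(f"joint loss: {loss:.4f}")
```

In a setup of this kind, the critic would typically be trained to tighten the bound while the decoder is trained to reduce the penalty, so that the reconstructed envelope carries less subject identity information. Whether SDN-Net alternates these updates or optimizes them jointly is not stated in the abstract.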
