BrainStratify: Coarse-to-Fine Disentanglement of Intracranial Neural Dynamics
Hui Zheng, Hai-Teng Wang, Yi-Tao Jing, Pei-Yang Lin, Han-Qing Zhao, Wei Chen, Peng-Hu Wei, Yong-Zhi Shan, Guo-Guang Zhao, Yun-Zhe Liu
TL;DR
BrainStratify tackles speech decoding from intracranial recordings with a two-stage, coarse-to-fine disentanglement framework. The coarse stage pre-trains a patch-based temporal-spatial backbone on a spatial-context task, then applies spectral clustering to the learned inter-channel attention to identify functional channel groups. The fine stage introduces Decoupled Product Quantization (DPQ) to learn independent, fine-grained neural states within those groups, aided by a DPQ-guided masked modeling objective. Across sEEG and epidural ECoG datasets, BrainStratify achieves state-of-the-art word and syllable decoding performance with interpretable functional groupings and data-efficient training, suggesting strong potential for clinically viable neuroprosthetics.
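To make the coarse stage's grouping step concrete, the following is a minimal sketch of spectral clustering applied to an inter-channel attention matrix. It is an illustration under stated assumptions, not the authors' implementation: the function name spectral_groups, the symmetrization of the attention matrix, the use of the symmetric normalized Laplacian, and the small deterministic k-means are all choices made here for the example.

```python
import numpy as np

def spectral_groups(attention, n_groups, n_iter=50):
    """Group channels from a (C, C) non-negative attention matrix.

    Hypothetical helper for illustration; the paper's exact procedure may differ.
    """
    # Assumption: symmetrize attention so it can act as an undirected affinity graph.
    affinity = 0.5 * (attention + attention.T)
    np.fill_diagonal(affinity, 0.0)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    laplacian = np.eye(len(affinity)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; keep the n_groups smallest.
    _, eigvecs = np.linalg.eigh(laplacian)
    embedding = eigvecs[:, :n_groups]
    norms = np.linalg.norm(embedding, axis=1, keepdims=True)
    embedding = embedding / np.maximum(norms, 1e-12)

    # Deterministic farthest-point initialization for k-means.
    centers = [embedding[0]]
    for _ in range(1, n_groups):
        dists = np.min([np.sum((embedding - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(embedding[np.argmax(dists)])
    centers = np.array(centers)

    # Plain Lloyd iterations in the spectral embedding space.
    for _ in range(n_iter):
        sq_dists = ((embedding[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq_dists, axis=1)
        for k in range(n_groups):
            if np.any(labels == k):
                centers[k] = embedding[labels == k].mean(axis=0)
    return labels

# Toy example: six channels forming two strongly connected blocks.
toy_attention = np.full((6, 6), 0.01)
toy_attention[:3, :3] = 1.0
toy_attention[3:, 3:] = 1.0
toy_labels = spectral_groups(toy_attention, n_groups=2)
```

In the toy example, channels 0 to 2 and channels 3 to 5 attend strongly within their own block and only weakly across blocks, so the two blocks come out as separate groups. The integer label assigned to each group is arbitrary.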
Abstract
Decoding speech directly from neural activity is a central goal in brain-computer interface (BCI) research. Recent advances have come from the growing use of intracranial field potential recordings, such as stereo-electroencephalography (sEEG) and electrocorticography (ECoG). These neural signals capture rich population-level activity but present key challenges: (i) task-relevant neural signals are sparsely distributed across sEEG electrodes, and (ii) they are often entangled with task-irrelevant neural signals in both sEEG and ECoG. To address these challenges, we introduce BrainStratify, a unified coarse-to-fine neural disentanglement framework that (i) identifies functional groups through spatial-context-guided temporal-spatial modeling, and (ii) disentangles distinct neural dynamics within the target functional group using Decoupled Product Quantization (DPQ). We evaluate BrainStratify on two open-source sEEG datasets and one (epidural) ECoG dataset, spanning tasks such as vocal production and speech perception. Extensive experiments show that BrainStratify, as a unified framework for decoding speech from intracranial neural signals, significantly outperforms previous decoding methods. Overall, by combining data-driven stratification with neuroscience-inspired modularity, BrainStratify offers a robust and interpretable solution for speech decoding from intracranial recordings.
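As background for the fine stage, the sketch below shows plain product quantization, the general technique that DPQ builds on. It does not implement the decoupling that distinguishes DPQ, which this section does not specify. The function name product_quantize, the array shapes, and the toy codebooks are assumptions made for illustration.

```python
import numpy as np

def product_quantize(x, codebooks):
    """Quantize x of shape (D,) with M independent codebooks.

    codebooks has shape (M, K, D // M): M sub-spaces, K codewords each.
    Returns the M selected code indices and the (D,) reconstruction.
    Hypothetical helper; standard product quantization, not the paper's DPQ.
    """
    n_sub, _, sub_dim = codebooks.shape
    # Split the vector into M contiguous sub-vectors.
    sub_vectors = x.reshape(n_sub, sub_dim)
    # Squared Euclidean distance from each sub-vector to each codeword: (M, K).
    dists = ((sub_vectors[:, None, :] - codebooks) ** 2).sum(axis=-1)
    # Each sub-space picks its nearest codeword independently.
    codes = dists.argmin(axis=1)
    reconstruction = codebooks[np.arange(n_sub), codes].reshape(-1)
    return codes, reconstruction

# Toy example: D = 4, M = 2 sub-spaces, K = 2 codewords per sub-space.
toy_codebooks = np.array([
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [5.0, 5.0]],
])
toy_x = np.array([0.9, 1.2, 4.0, 4.4])
toy_codes, toy_recon = product_quantize(toy_x, toy_codebooks)
```

Because each sub-space chooses its codeword independently, M codebooks of K entries can represent K^M distinct combinations while storing only M x K codewords, which is why a product structure is a natural fit for representing several neural states at once.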
