BrainStratify: Coarse-to-Fine Disentanglement of Intracranial Neural Dynamics

Hui Zheng, Hai-Teng Wang, Yi-Tao Jing, Pei-Yang Lin, Han-Qing Zhao, Wei Chen, Peng-Hu Wei, Yong-Zhi Shan, Guo-Guang Zhao, Yun-Zhe Liu

TL;DR

BrainStratify tackles speech decoding from invasive intracranial recordings by proposing a two-stage coarse-to-fine disentanglement framework. The coarse stage identifies functional channel groups via a patch-based temporal-spatial backbone and a spatial-context pre-training task, followed by spectral clustering of inter-channel attention to form groups. The fine stage introduces Decoupled Product Quantization to learn independent, fine-grained neural states within those groups, aided by a DPQ-guided masked modeling objective. Across sEEG and epidural ECoG datasets, BrainStratify achieves state-of-the-art word and syllable decoding performance with interpretable functional groupings and data-efficient training, suggesting strong potential for clinically viable neuroprosthetics.
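
The grouping step of the coarse stage (spectral clustering of inter-channel attention) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `spectral_groups`, the symmetrization of the attention matrix, the normalized-Laplacian embedding, and the small k-means loop with farthest-point initialization are all assumptions chosen for illustration, and the toy attention matrix is synthetic.

```python
import numpy as np

def spectral_groups(attention, n_groups, n_iter=50):
    """Cluster channels from a (channels x channels) attention matrix.

    Hypothetical helper: a plain normalized spectral clustering, not the
    exact procedure used in the paper.
    """
    # Attention is generally asymmetric; average with its transpose (assumption).
    affinity = 0.5 * (attention + attention.T)
    np.fill_diagonal(affinity, 0.0)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    normalized = d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    laplacian = np.eye(len(affinity)) - normalized

    # Eigenvectors of the smallest eigenvalues give the spectral embedding.
    _, eigvecs = np.linalg.eigh(laplacian)
    embedding = eigvecs[:, :n_groups]
    norms = np.linalg.norm(embedding, axis=1, keepdims=True)
    embedding = embedding / np.maximum(norms, 1e-12)

    # Deterministic farthest-point initialization for k-means.
    centers = [embedding[0]]
    for _ in range(1, n_groups):
        d = np.min([((embedding - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(embedding[d.argmax()])
    centers = np.array(centers)

    # Standard k-means iterations on the embedded channels.
    for _ in range(n_iter):
        dists = ((embedding[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for k in range(n_groups):
            if np.any(labels == k):
                centers[k] = embedding[labels == k].mean(axis=0)
    return labels

# Toy example: six channels forming two groups with strong within-group attention.
toy_attention = np.full((6, 6), 0.05)
toy_attention[:3, :3] = 0.9
toy_attention[3:, 3:] = 0.9
toy_labels = spectral_groups(toy_attention, n_groups=2)
print(toy_labels)
```

In this toy case the first three channels receive one label and the last three the other. With real attention maps the group count is a hyperparameter that would need to be chosen or validated separately.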

Abstract

Decoding speech directly from neural activity is a central goal in brain-computer interface (BCI) research. In recent years, exciting advances have been made through the growing use of intracranial field potential recordings, such as stereo-ElectroEncephaloGraphy (sEEG) and ElectroCorticoGraphy (ECoG). These neural signals capture rich population-level activity but present key challenges: (i) task-relevant neural signals are sparsely distributed across sEEG electrodes, and (ii) they are often entangled with task-irrelevant neural signals in both sEEG and ECoG. To address these challenges, we introduce a unified Coarse-to-Fine neural disentanglement framework, BrainStratify, which includes (i) identifying functional groups through spatial-context-guided temporal-spatial modeling, and (ii) disentangling distinct neural dynamics within the target functional group using Decoupled Product Quantization (DPQ). We evaluate BrainStratify on two open-source sEEG datasets and one (epidural) ECoG dataset, spanning tasks like vocal production and speech perception. Extensive experiments show that BrainStratify, as a unified framework for decoding speech from intracranial neural signals, significantly outperforms previous decoding methods. Overall, by combining data-driven stratification with neuroscience-inspired modularity, BrainStratify offers a robust and interpretable solution for speech decoding from intracranial recordings.
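
As background for the quantization step, the sketch below shows plain product quantization: each embedding is split into sub-vectors, and each sub-vector is replaced by the nearest entry of its own codebook. It does not include the decoupling objective that distinguishes DPQ, which is not specified in this summary. The function name `product_quantize`, the array shapes, and the toy codebooks are assumptions made for illustration.

```python
import numpy as np

def product_quantize(x, codebooks):
    """Quantize each sub-vector of x against its own codebook.

    x:         (n, d) array of embeddings.
    codebooks: (G, K, d // G) array, one K-entry codebook per group.
    Returns the code indices (n, G) and the reconstruction (n, d).
    """
    n_groups, _, sub_dim = codebooks.shape
    assert x.shape[1] == n_groups * sub_dim, "d must equal G * sub_dim"

    # Split every embedding into G contiguous sub-vectors.
    sub = x.reshape(len(x), n_groups, sub_dim)

    # Squared Euclidean distance to every code in the matching codebook: (n, G, K).
    dists = ((sub[:, :, None, :] - codebooks[None, :, :, :]) ** 2).sum(axis=-1)
    codes = dists.argmin(axis=-1)

    # Look up the selected code for each group and concatenate them again.
    quantized = codebooks[np.arange(n_groups)[None, :], codes]
    return codes, quantized.reshape(len(x), -1)

# Toy codebooks (hypothetical values): two groups, two codes each, sub-dimension 2.
toy_codebooks = np.array([
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [2.0, 2.0]],
])
toy_x = np.array([
    [0.9, 1.1, 0.1, -0.1],
    [0.1, 0.0, 1.8, 2.1],
])
toy_codes, toy_recon = product_quantize(toy_x, toy_codebooks)
print(toy_codes)
print(toy_recon)
```

Because each group indexes its own codebook, G groups of K codes can represent up to K^G distinct combinations while storing only G * K code vectors. That compactness is the usual motivation for product quantization.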

Paper Structure

This paper contains 68 sections, 6 equations, 15 figures, 27 tables, 1 algorithm.

Figures (15)

  • Figure 1: The 61-word performance on the Du-IN dataset [zheng2025discrete] using the top-10 channels selected via the MC strategy, across varying numbers of labeled samples.
  • Figure 2: Overview of the BrainStratify framework. (a) Coarse Disentanglement Learning Stage (BrainStratify-Coarse). (b) Fine Disentanglement Learning Stage (BrainStratify-Fine).
  • Figure 3: The channel connectivity from different methods.
  • Figure 4: Ablation study on different codex groups, codex sizes, and codex dimensions. We report the 61-word performance on the Du-IN dataset [zheng2025discrete]; see the appendix for details of the model variants.
  • Figure 5: Overview of the ECoG configuration. (a) The implant configuration. Our (epidural) ECoG is placed above the vSMC, which is involved in vocal production [silva2024speech]. (b) The channel resistance. Electrodes at the four corners are excluded from downstream analysis.
  • ...and 10 more figures