Toward Robust, Reproducible, and Widely Accessible Intracranial Language Brain-Computer Interfaces: A Comprehensive Review of Neural Mechanisms, Hardware, Algorithms, Evaluation, Clinical Pathways and Future Directions

Dongyi He, Wai Ting Siok, Nizhuan Wang

Abstract

Intracranial language brain-computer interfaces (BCIs) are a promising route for restoring communication in people with severe motor and speech impairments, but clinical translation remains limited by fragmented evidence and unresolved design trade-offs across neuroscience, hardware, algorithms, evaluation, and clinical deployment. This review synthesizes progress in neural mechanisms of overt, mimed, and imagined speech; decision-oriented hardware comparisons of microelectrode array (MEA), electrocorticography (ECoG), and stereotactic electroencephalography (SEEG) recording modalities; experimental design for cross-subject and multilingual generalization; and neural decoding advances spanning sequence models, transformers, articulatory intermediates, and language-prior-assisted frameworks. We highlight persistent bottlenecks, including weak cross-subject transfer, long-term non-stationarity and recalibration burden, heterogeneous and non-comparable evaluation practices, limited naturalistic expressivity (especially for tonal/logosyllabic languages), and low signal-to-noise ratio (SNR) of neural activity in covert speech decoding. Our contributions are threefold: (1) an end-to-end, decision-oriented synthesis linking neural representations to recording choices, experimental design, decoding model architectures, and translational constraints; (2) a structured framework organized around five coupled design questions, together with a unified evaluation framework and a cross-language/cross-task benchmark template integrating objective, perceptual, expressive, conversational, and longitudinal metrics; and (3) user-centered translational guidance covering agency-preserving shared control, verifiable performance priorities, and scenario-specific minimum viable system (MVP) profiles for reliability-first home communication versus fidelity-first conversational speech restoration.

Paper Structure

This paper contains 31 sections, 5 equations, 8 figures, and 4 tables.

Figures (8)

  • Figure 1: Lane-based timeline of representative intracranial language decoding frameworks (selected representatives from Table \ref{tab:decoding-method-review-main}). The frameworks are grouped into three categories and ordered by publication year. Each card includes the representative studies with a short framework label and technical tags summarizing key methods.
  • Figure 2: Schematic of the intracranial language BCI decoding pipeline with hardware-software co-design. Neural activity from ECoG/SEEG/MEA recordings is transformed into neural features and decoded into phonemes, text, speech features, or synthesized speech; hardware design largely constrains the performance ceiling, whereas software design determines how closely it is approached.
  • Figure 3: Mechanism–Recording–Model–Task framework for language decoding. Neural speech representations (somatotopy with mixed tuning, dorsal–ventral streams, overt–mimed–imagined SNR gradient, and temporal population dynamics) shape measurable multielectrode signals across distributed regions and frequency bands using ECoG, SEEG and MEAs. Decoders align with these signal properties via multibranch feature extraction, sequence modeling (CTC/RNN/Transformer), intermediate representations (phonemes, pitch, formants), and subject adaptation, enabling tasks from articulation/phoneme/word classification to continuous sentence decoding and speech waveform reconstruction.
  • Figure 4: Schematic illustration of a practical selection heuristic for intracranial recording modalities. Macro-ECoG/SEEG prioritize coverage and deployment practicality, $\mu$ECoG prioritizes cortical spatial detail, and intracortical MEAs prioritize maximal local information density and high-rate decoding at higher implantation/maintenance complexity.
  • Figure 5: Conceptual landscape of language BCI decoding models. Decoding approaches are positioned qualitatively along two axes: biological interpretability and population-level generalization. Direct acoustic models typically offer limited interpretability and cross-subject robustness, whereas articulatory intermediates and topology-agnostic transformers occupy the high-generalization regime. Dual-path frameworks integrate acoustic and linguistic representations, and language-prior–assisted models improve large-vocabulary performance but may introduce bias. Positions reflect conceptual trends rather than quantitative benchmarking.
  • ...and 3 more figures