CIPHER: Conformer-based Inference of Phonemes from High-density EEG

Varshith Madishetty

Abstract

Decoding speech information from scalp EEG remains difficult due to low SNR and spatial blurring. We present CIPHER (Conformer-based Inference of Phonemes from High-density EEG Representations), a dual-pathway model using (i) ERP features and (ii) broadband DDA coefficients. On OpenNeuro ds006104 (24 participants, two studies with concurrent TMS), binary articulatory tasks reach near-ceiling performance but are highly confound-vulnerable (acoustic onset separability and TMS-target blocking). On the primary 11-class CVC phoneme task under full Study 2 LOSO (16 held-out subjects), performance is substantially lower (real-word WER: ERP 0.671 +/- 0.080, DDA 0.688 +/- 0.096), indicating limited fine-grained discriminability. We therefore position this work as a benchmark and feature-comparison study rather than an EEG-to-text system, and we constrain neural-representation claims to confound-controlled evidence.
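
The LOSO (leave-one-subject-out) protocol named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `loso_splits` and the toy subject labels are hypothetical, and the sketch only shows the general idea that each fold holds out every trial of one subject and trains on the rest.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


def loso_splits(subject_ids: Sequence[str]) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject.

    Hypothetical helper for illustration; subject_ids[i] is the subject
    that produced trial i.
    """
    # Group trial indices by the subject they belong to.
    by_subject: Dict[str, List[int]] = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)

    # One fold per subject: that subject's trials are the test set,
    # every other subject's trials form the training set.
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = [i for sid, idxs in by_subject.items() if sid != held_out for i in idxs]
        yield held_out, train_idx, test_idx


# Toy example (assumed data): 3 subjects with 2 trials each.
subjects = ["s01", "s01", "s02", "s02", "s03", "s03"]
folds = list(loso_splits(subjects))
```

Because no trial from the held-out subject appears in training, per-fold scores reflect cross-subject generalization; reporting the mean and spread across folds gives figures of the same form as the "+/-" values quoted above.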

Paper Structure

This paper contains 37 sections, 12 equations, 9 figures, 13 tables.

Figures (9)

  • Figure 1: CIPHER dual-pathway architecture. Raw EEG is processed in parallel through an ERP pathway (narrowband, phase-locked dynamics) and a DDA pathway (broadband nonlinear dynamical coefficients), fused by a shared Conformer encoder, and decoded by four task-specific classification heads.
  • Figure 2: WER comparison across ERP/DDA and real/pseudoword conditions.
  • Figure 3: Matched-split baseline accuracy heatmap.
  • Figure 4: Real vs. pseudoword lexicality effect (non-significant).
  • Figure 5: TMS ANOVA $p$-values by feature and place stratum.
  • ...and 4 more figures