Geometry of orofacial neuromuscular signals: speech articulation decoding using surface electromyography
Harshavardhana T. Gowda, Zachary D. McNaughton, Lee M. Miller
TL;DR
This work addresses decoding speech articulations from multichannel surface EMG by representing EMG signals as edge covariances that lie on the SPD manifold. It introduces a geometry-aware pipeline that combines Cholesky-based distances, Fréchet means, k-medoids clustering, minimum-distance-to-mean (MDM) classification, and SPD-network architectures (SPDNet and a manifold-aware GRU) to decode gestures, phonemes, and words, and it adds a NATO-based spelling paradigm. A key contribution is showing that EMG embeddings on the SPD manifold are highly structured and discriminative, which enables data-efficient decoding from small training sets and reveals substantial cross-subject distribution shifts, modeled as changes of basis. The open-source dataset (16 subjects) and code, along with demonstrations of phoneme- and word-level decoding using only ES, establish a foundation for practical EMG-to-language translation and inform model design for subject variability.
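The pipeline summarized above can be illustrated with a minimal sketch. It assumes the log-Cholesky metric as one possible "Cholesky-based distance" (the summary does not specify which one is used), and every function name here (`emg_covariance`, `log_cholesky_coords`, `frechet_mean`, `mdm_predict`) is hypothetical rather than taken from the released code. Under this metric the Fréchet mean has a closed form, which keeps a minimum-distance-to-mean classifier short.

```python
import numpy as np

def emg_covariance(window, eps=1e-6):
    """Map an EMG window of shape (channels, samples) to an SPD matrix."""
    x = window - window.mean(axis=1, keepdims=True)
    cov = x @ x.T / (x.shape[1] - 1)
    # Assumption: a small ridge keeps the matrix strictly positive definite.
    return cov + eps * np.eye(cov.shape[0])

def log_cholesky_coords(spd):
    """Strictly lower part of the Cholesky factor plus the log of its diagonal."""
    L = np.linalg.cholesky(spd)
    coords = np.tril(L, k=-1)
    coords[np.diag_indices_from(coords)] = np.log(np.diag(L))
    return coords

def log_cholesky_distance(a, b):
    """Frobenius distance between log-Cholesky coordinates."""
    return np.linalg.norm(log_cholesky_coords(a) - log_cholesky_coords(b))

def frechet_mean(spds):
    """Closed-form Fréchet mean under the log-Cholesky metric."""
    mean_coords = np.mean([log_cholesky_coords(s) for s in spds], axis=0)
    L = np.tril(mean_coords, k=-1) + np.diag(np.exp(np.diag(mean_coords)))
    return L @ L.T

def mdm_predict(sample, class_means):
    """Return the label whose class mean is closest to `sample`."""
    return min(class_means,
               key=lambda label: log_cholesky_distance(sample, class_means[label]))
```

In use, one would compute one Fréchet mean per class from training covariances, store them in a dictionary keyed by label, and pass each test covariance to `mdm_predict`.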
Abstract
Objective. In this article, we present data and methods for decoding speech articulations using surface electromyogram (EMG) signals. EMG-based speech neuroprostheses offer a promising approach for restoring audible speech in individuals who have lost the ability to speak intelligibly due to laryngectomy, neuromuscular diseases, stroke, or trauma-induced damage (e.g., from radiotherapy) to the speech articulators. Approach. To achieve this, we collect EMG signals from the face, jaw, and neck as subjects articulate speech, and we perform EMG-to-speech translation. Main results. Our findings reveal that the manifold of symmetric positive definite (SPD) matrices serves as a natural embedding space for EMG signals. Specifically, we provide an algebraic interpretation of the manifold-valued EMG data using linear transformations, and we analyze and quantify distribution shifts in EMG signals across individuals. Significance. Overall, our approach demonstrates significant potential for developing neural networks that are both data- and parameter-efficient, an important consideration for EMG-based systems, which face challenges in large-scale data collection and operate under limited computational resources on embedded devices.
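To make the linear-transformation view concrete, the sketch below uses a standard fact about covariances: if multichannel signals x are mapped to A x by an invertible matrix A, their covariance Σ becomes A Σ Aᵀ, which is again SPD. Treating a cross-subject shift as such a map is an illustrative assumption here, not a description of the authors' exact model, and the random data, the matrix `A`, and the helper `change_basis` are all hypothetical.

```python
import numpy as np

def change_basis(cov, A):
    """Covariance of A @ x when x has covariance `cov` (a congruence transform)."""
    return A @ cov @ A.T

rng = np.random.default_rng(0)
channels, samples = 4, 2000

# Hypothetical "reference subject": random data standing in for an EMG window.
x_ref = rng.standard_normal((channels, samples))
cov_ref = np.cov(x_ref)  # rows are channels, columns are time samples

# Hypothetical invertible map: per-channel gains followed by an orthogonal mixing.
Q, _ = np.linalg.qr(rng.standard_normal((channels, channels)))
A = Q @ np.diag([0.5, 1.0, 1.5, 2.0])

# Covariance of the transformed signals, computed directly from the data.
cov_new = np.cov(A @ x_ref)

# Applying the inverse map to the covariance recovers the reference covariance.
cov_back = change_basis(cov_new, np.linalg.inv(A))
```

Here `cov_new` matches `change_basis(cov_ref, A)` up to floating-point error, so if the map between two recordings were known, covariances from one could be expressed in the basis of the other before comparison.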
