
Affect Decoding in Phonated and Silent Speech Production from Surface EMG

Simon Pistrosch, Kleanthis Avramidis, Tiantian Feng, Jihwan Lee, Monica Gonzalez-Machorro, Shrikanth Narayanan, Björn W. Schuller

Abstract

The expression of affect is integral to spoken communication, yet its link to the underlying articulatory execution remains unclear. Measures of articulatory muscle activity such as electromyography (EMG) could reveal how speech production is modulated by emotion, complementing acoustic speech analyses. We investigate affect decoding from facial and neck surface electromyography (sEMG) during phonated and silent speech production. For this purpose, we introduce a dataset comprising 2,780 utterances from 12 participants across 3 tasks, on which we evaluate both intra- and inter-subject decoding using a range of features and model embeddings. Our results reveal that EMG representations reliably discriminate frustration with up to 0.845 AUC and generalize well across articulation modes. Our ablation study further demonstrates that affective signatures are embedded in facial motor activity and persist in the absence of phonation, highlighting the potential of EMG sensing for affect-aware silent speech interfaces.

Paper Structure (26 sections, 2 equations, 6 figures, 9 tables)


Figures (6)

  • Figure 1: Conceptual overview of the study. We present a dataset and computational analysis on EMG-based affect decoding during phonated and silent speech production. During articulation, surface EMG from neck and facial muscles was recorded alongside audio speech. Note: The schematic human illustration was generated with AI assistance for visualization purposes and is not meant to reflect the exact sensor hardware design, number of channels, or placement used in the study.
  • Figure 2: Annotation results for Task 2A (designed to induce politeness) and Task 2B (designed to induce frustration). Individual trial annotations, pooled across the 3 annotators, are overlaid on the boxplots. Inter-annotator agreement is reported as Krippendorff's alpha. Light jittering is applied to the integer annotation values for visualization purposes.
  • Figure 3: Comparison of intra-subject AUC between Task 1 and Task 3 across speaking conditions. Left: tested on all sentences. Right: tested on the repeated sentences (see also Table \ref{tab:repeated}). Dots correspond to average individual performance.
  • Figure 4: Channel-wise decoding performance across evaluation settings (RQ1). Topographic visualization of electrode-specific AUC for the intra-subject (left) and inter-subject (right) settings in Tasks 1 and 3. Each marker corresponds to an EMG channel, with warmer colors reflecting higher discriminability.
  • Figure 5: Channel-wise decoding performance across articulation conditions (RQ2). Topographic visualization of electrode-specific AUC for the phonated (left) and silent (right) conditions in Tasks 1 and 3. Each numbered marker corresponds to an EMG channel, and color indicates intra-subject AUC, with warmer colors reflecting higher discriminability.
  • ...and 1 more figure