Automated Measurement of Geniohyoid Muscle Thickness During Speech Using Deep Learning and Ultrasound

Alisher Myrgyyassov; Bruce Xiao Wang; Yu Sun; Shuming Huang; Zhen Song; Min Ney Wong; Yongping Zheng

TL;DR

SMMA achieves expert-validated accuracy while eliminating the need for manual annotation, enabling scalable investigations of speech motor control and objective assessment of speech and swallowing disorders.

Abstract

Manual measurement of muscle morphology from ultrasound during speech is time-consuming and limits large-scale studies. We present SMMA, a fully automated framework that combines deep-learning segmentation with skeleton-based thickness quantification to analyze geniohyoid (GH) muscle dynamics. Validation against expert annotations shows near-human-level accuracy (segmentation Dice = 0.9037; thickness MAE = 0.53 mm, r = 0.901). Applied to Cantonese vowel production (N = 11), SMMA reveals systematic patterns: /a:/ shows significantly greater GH thickness (7.29 mm) than /i:/ (5.95 mm; p < 0.001, Cohen's d > 1.3), suggesting greater GH activation during /a:/ than /i:/, consistent with the muscle's role in mandibular depression. GH thickness is 5-8% greater in males, a difference that reflects anatomical scaling. SMMA achieves expert-validated accuracy while eliminating the need for manual annotation, enabling scalable investigations of speech motor control and objective assessment of speech and swallowing disorders.
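The statistics quoted in the abstract (Dice, MAE, Pearson r, Cohen's d) have standard definitions, and the following minimal sketch shows how they are commonly computed. It is not the authors' evaluation code: the function names are hypothetical, NumPy is an assumed dependency, and the pooled-standard-deviation form of Cohen's d is an assumption, since the abstract does not say which variant was used.

```python
import numpy as np

def dice(pred, truth):
    """Dice overlap between two binary masks (1.0 means identical)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return float(2.0 * np.logical_and(pred, truth).sum() / total)

def mae(auto_mm, manual_mm):
    """Mean absolute error between automated and manual thickness (mm)."""
    a = np.asarray(auto_mm, dtype=float)
    m = np.asarray(manual_mm, dtype=float)
    return float(np.mean(np.abs(a - m)))

def pearson_r(auto_mm, manual_mm):
    """Pearson correlation between automated and manual thickness."""
    return float(np.corrcoef(auto_mm, manual_mm)[0, 1])

def cohens_d(x, y):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return float((x.mean() - y.mean()) / np.sqrt(pooled_var))

# Toy inputs for illustration only; these are not data from the paper.
toy_pred = [[1, 1], [0, 0]]
toy_truth = [[1, 0], [0, 0]]
toy_dice = dice(toy_pred, toy_truth)
```

With real data, `dice` would be applied per frame to predicted and annotated masks, while `mae` and `pearson_r` would be applied to paired automated and manual thickness values.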

Paper Structure (16 sections, 2 equations, 3 figures, 2 tables)

Introduction
Methodology
Overview
Component 1: Segmentation
Component 2: Thickness Extraction
Datasets and Data Collection Protocols
Validation
Applications
Results
Component 1 Validation
Component 2 Validation
Application in Isolated Vowel Analysis
Discussion
Conclusion
Acknowledgements
...and 1 more section

Figures (3)

Figure 1: Visualization of the SMMA pipeline. Panel (a) shows the original image and probe placement, (b) the mask automatically generated by UltraUNet, and (c) the middle 50% of the skeleton extracted from the mask with the corresponding thickness measurements in pixels (px) and millimetres (mm).
Figure 2: Component 2 thickness measurement validation: SMMA automated measurements (left) vs. sonographer ground truth annotations (right).
Figure 3: Representative 30-second sample from the full recording of continuous muscle thickness tracking on a female subject during speech production. The subject produces repeated /a/-/i/-/u/ isolated vowel sequences (purple = /a/, red = /i/, green = /u/, white = pause), each vowel lasting around 800 ms. Brief gaps in tracking occur when image quality temporarily degrades (e.g., 75 s), demonstrating algorithm behavior under naturalistic recording conditions.
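To make the thickness step in Figure 1(c) concrete, the sketch below measures a binary mask over the central 50% of its horizontal extent and converts pixels to millimetres. It is a deliberately simplified stand-in, not SMMA's skeleton-based method: it counts foreground pixels per image column, which only approximates thickness when the muscle lies roughly horizontally in the image. The function name, the `mm_per_px` calibration value, and the use of NumPy are all assumptions for illustration.

```python
import numpy as np

def middle_thickness(mask, mm_per_px, keep_fraction=0.5):
    """Mean column-wise thickness over the central part of a binary mask.

    Simplified stand-in for skeleton-based measurement: thickness is
    approximated as the number of foreground pixels in each column.
    Returns (thickness_px, thickness_mm), or None if the mask is empty.
    """
    mask = np.asarray(mask, dtype=bool)
    cols = np.flatnonzero(mask.any(axis=0))  # columns that contain mask pixels
    if cols.size == 0:
        return None  # nothing segmented in this frame (a tracking gap)

    # Trim equal amounts from both ends so only the central fraction remains.
    n = cols.size
    trim = int(round(n * (1.0 - keep_fraction) / 2.0))
    central = cols[trim:n - trim] if n - 2 * trim > 0 else cols

    thickness_px = float(mask[:, central].sum(axis=0).mean())
    return thickness_px, thickness_px * mm_per_px

# Toy mask: a horizontal band 4 px thick and 8 px wide (illustrative only).
toy_mask = np.zeros((10, 8), dtype=bool)
toy_mask[3:7, :] = True
toy_px, toy_mm = middle_thickness(toy_mask, mm_per_px=0.1)  # 0.1 mm/px is an assumed scale
```

Returning `None` for an empty mask mirrors the brief tracking gaps described in the Figure 3 caption, where no measurement is reported for frames with degraded image quality.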
