Voice Biomarker Analysis and Automated Severity Classification of Dysarthric Speech in a Multilingual Context

Eunjung Yeo

TL;DR

This work addresses automatic severity classification of dysarthric speech in a multilingual setting by integrating language-universal and language-specific voice biomarkers across English, Korean, and Tamil. It proposes an architecture that combines handcrafted, clinically informed features with an XGBoost classifier, validated via two-step statistical and clinical procedures to avoid data leakage. The results show that incorporating language-specific features yields improvements over traditional multilingual baselines, achieving parity with or surpassing monolingual models on average across languages. The study paves the way for equitable, cross-language dysarthria assessment and outlines concrete directions for robust multilingual models and balanced datasets.

Abstract

Dysarthria, a motor speech disorder, severely impacts voice quality, pronunciation, and prosody, leading to diminished speech intelligibility and reduced quality of life. Accurate assessment is crucial for effective treatment, but traditional perceptual assessments are limited by their subjectivity and resource intensity. To mitigate these limitations, automatic dysarthric speech assessment methods have been proposed to support clinicians in their decision-making. While these methods have shown promising results, most research has focused on monolingual environments. However, multilingual approaches are necessary to address the global burden of dysarthria and ensure equitable access to accurate diagnosis. This thesis proposes a novel multilingual dysarthria severity classification method by analyzing three languages: English, Korean, and Tamil.

Paper Structure

This paper contains 21 sections, 1 figure, 8 tables.

Figures (1)

  • Figure 11: Classification performance using different ratios of the Tamil dataset.