The order in speech disorder: a scoping review of state of the art machine learning methods for clinical speech classification
Birger Moell, Fredrik Sand Aronsson, Per Östberg, Jonas Beskow
TL;DR
This scoping review surveys state-of-the-art machine learning methods that leverage speech for clinical classification across neurological, laryngeal, and mental disorders. It synthesizes 91 included studies from 564 identified articles, highlighting strong diagnostic performance for Parkinson's disease, dysarthria, and laryngeal pathologies, with substantial but variable results for depression, schizophrenia, mild cognitive impairment (MCI), and Alzheimer's dementia (AD). The review underscores practical considerations for clinical translation, including data collection quality, explainability, ecological validity, and the need for multiclass, multimodal, real-time, and privacy-preserving approaches. While results are promising, especially for speech-derived biomarkers, widespread clinical adoption requires addressing biases (e.g., gender) and overfitting risks, and ensuring interpretability and ethical use. Overall, speech-based ML diagnostics hold significant potential to augment clinical assessment, but require rigorous validation and thoughtful integration into care pathways.
Abstract
Background: Speech patterns have emerged as potential diagnostic markers for conditions with varying etiologies. Machine learning (ML) presents an opportunity to harness these patterns for accurate disease diagnosis. Objective: This review synthesized findings from studies exploring ML's capability in leveraging speech for the diagnosis of neurological, laryngeal, and mental disorders. Methods: A systematic examination of 564 articles was conducted, and 91 articles were included in the review; these covered a wide spectrum of conditions, ranging from voice pathologies to mental and neurological disorders. Speech classification methods were assessed in the relevant studies and scored from 0 to 10 based on the reported diagnostic accuracy of their ML models. Results: High diagnostic accuracies were consistently observed for laryngeal disorders, dysarthria, and speech-related changes in Parkinson's disease. These findings indicate the robust potential of speech as a diagnostic tool. Disorders like depression, schizophrenia, mild cognitive impairment, and Alzheimer's dementia also demonstrated high accuracies, albeit with some variability across studies. Meanwhile, disorders like OCD and autism highlighted the need for more extensive research to ascertain the relationship between speech patterns and the respective conditions. Conclusion: ML models utilizing speech patterns demonstrate promising potential in diagnosing a range of mental, laryngeal, and neurological disorders. However, the efficacy varies across conditions, and further research is needed. The integration of these models into clinical practice could revolutionize the evaluation and diagnosis of a range of medical conditions.
