Interpretable Early Detection of Parkinson's Disease through Speech Analysis

Lorenzo Simone, Mauro Giuseppe Camporeale, Vito Marco Rubino, Vincenzo Gervasi, Giovanni Dimauro

TL;DR

This work targets early Parkinson's disease (PD) detection from speech by employing a temporal CNN that analyzes speech segmented into word chunks, combined with a 1D Grad-CAM-based interpretability mechanism that highlights the predictive segments. The approach achieves competitive performance relative to traditional classifiers while enhancing interpretability through segment-level attributions, revealing phonetic markers linked to PD. Evaluations on the Italian Parkinson's Voice and Speech Database demonstrate significant improvements in accuracy, recall, and F1-score, with heatmap-guided insights into the speech segments driving predictions. The study offers a reproducible, explainable framework for temporally aware speech-based PD detection with potential for cross-language generalization and personalized monitoring.

Abstract

Parkinson's disease is a progressive neurodegenerative disorder affecting motor and non-motor functions, with speech impairments among its earliest symptoms. Speech impairments offer a valuable diagnostic opportunity, with machine learning advances providing promising tools for timely detection. In this research, we propose a deep learning approach for early Parkinson's disease detection from speech recordings, which also highlights the vocal segments driving predictions to enhance interpretability. This approach seeks to associate predictive speech patterns with articulatory features, providing a basis for interpreting underlying neuromuscular impairments. We evaluated our approach using the Italian Parkinson's Voice and Speech Database, containing 831 audio recordings from 65 participants, including both healthy individuals and patients. Our approach showed competitive classification performance compared to state-of-the-art methods, while providing enhanced interpretability by identifying key speech features influencing predictions.
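
The summary above does not specify the network architecture or the exact Grad-CAM variant, so the following is only a minimal sketch of the standard Grad-CAM computation applied along a single time axis. It assumes that the activations of the last convolutional layer and the gradients of the PD class score with respect to them are already available as (channels, time) arrays. The function names `grad_cam_1d` and `segment_scores`, the segment boundaries, and the toy values are hypothetical and are not taken from the paper.

```python
import numpy as np


def grad_cam_1d(activations, gradients):
    """Return a normalized 1D Grad-CAM heatmap over time.

    activations, gradients: arrays of shape (channels, time), assumed to come
    from the last convolutional layer of a temporal CNN (hypothetical setup).
    """
    activations = np.asarray(activations, dtype=float)
    gradients = np.asarray(gradients, dtype=float)
    if activations.shape != gradients.shape:
        raise ValueError("activations and gradients must have the same shape")

    # Channel importance: average gradient over the time axis.
    weights = gradients.mean(axis=1)                    # shape (channels,)
    # Weighted sum of channels, then ReLU to keep positive evidence only.
    cam = np.maximum(weights @ activations, 0.0)        # shape (time,)
    # Scale to [0, 1] when the map is not identically zero.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def segment_scores(cam, boundaries):
    """Average the heatmap inside each [start, end) index range."""
    return [float(cam[start:end].mean()) for start, end in boundaries]


# Toy example: 2 channels, 4 time steps (illustrative values only).
toy_activations = [[0.0, 1.0, 2.0, 3.0],
                   [1.0, 1.0, 1.0, 1.0]]
toy_gradients = [[1.0, 1.0, 1.0, 1.0],
                 [-1.0, -1.0, -1.0, -1.0]]

heatmap = grad_cam_1d(toy_activations, toy_gradients)
per_segment = segment_scores(heatmap, [(0, 2), (2, 4)])
print(heatmap)       # [0.  0.  0.5 1. ]
print(per_segment)   # [0.0, 0.75]
```

Averaging the heatmap within each segment yields one score per chunk, which is one plausible way to obtain the kind of per-segment contributions that Figure 1 depicts; the paper may aggregate differently.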

Paper Structure

This paper contains 3 sections, 3 equations, 1 figure, 2 tables.

Figures (1)

  • Figure 1: Activation contributions for each audio segment, depicting differences between healthy (left) and Parkinson's disease (right) speech patterns.