A Transversal Study of Fundamental Frequency Contours in Parkinsonian Voices

Pablo Rodriguez-Perez, Ruben Fraile, Miguel Garcia-Escrig, Nicolas Saenz-Lechon, Juana M. Gutierrez-Arriola, Victor Osma-Ruiz

TL;DR

The paper addresses how Parkinson's disease alters speech prosody, focusing on fundamental frequency contours during read speech. It combines traditional $f_o$ statistics with a modulation-spectrum analysis to capture slow and fast pitch variations, and it accounts for inter-subject variability by analyzing men and women separately. The study finds that relative pitch range $\sigma_{f_o}/\mu_{f_o}$ correlates with PD stage, especially in women, while modulation-band energies ($LFER$, $MFER$, $HFER$) also relate to disease progression; however, regression models achieve modest explanatory power ($R^2$ about 0.37 overall, higher in women, lower in men) and yield only moderate diagnostic utility (AUC ~0.634, EER ~36%). Overall, intonation descriptors offer some PD-relevant information but are insufficient for reliable early diagnosis without additional markers, and they underscore the need for sex-specific analyses in PD voice studies.

Abstract

A transversal study of the pitch variability of parkinsonian voices in read speech is presented. 30 patients suffering from Parkinson's disease (PD) and 32 healthy speakers were recorded while reading a text without voiceless phonemes. The fundamental frequency contours were calculated from the recordings, and the following measures were used for describing them: mean, minimum, maximum, and standard deviation of the estimated fundamental frequencies. Results based on these measures indicate that the influence of PD on some aspects of intonation can be masked by the effects of aging, especially for male voices. However, some parameters such as the relative fundamental frequency range exhibit lower correlations with age than with PD stage, as evaluated using the Hoehn and Yahr scale. These correlations between relative fundamental frequency range and PD stage reach moderate-to-high values in the case of women. Additionally, three parameters describing the form of the fundamental frequency modulation spectrum were investigated for correlation with age and PD stage. The study of this modulation spectrum provides some insight into the ability of the speakers to plan the intonation of full phrases. For both male and female populations, significant correlations were found between parameters obtained from the modulation spectrum of fundamental frequency and the PD stage. Nevertheless, the quantitative assessment of the performance of regression models built from these modulation parameters and fundamental frequency range suggests that such measures are likely to be of limited value in the early diagnosis of PD due to inter-speaker variability.
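The contour descriptors listed in the abstract (mean, minimum, maximum, and standard deviation of $f_o$) and the relative range $\sigma_{f_o}/\mu_{f_o}$ mentioned in the TL;DR can be illustrated with a minimal sketch. The function name `f0_descriptors`, the convention that unvoiced frames are coded as 0, and the use of the population standard deviation are assumptions made here for illustration; the paper's own estimator may differ.

```python
import statistics


def f0_descriptors(f0_hz):
    """Summarise a fundamental frequency contour given in Hz.

    Assumption (not from the paper): frames with no pitch estimate are
    coded as 0 and are skipped before computing the statistics.
    """
    voiced = [f for f in f0_hz if f > 0]
    mu = statistics.mean(voiced)
    # Assumption: population standard deviation; a sample estimator
    # (statistics.stdev) would be an equally plausible choice.
    sigma = statistics.pstdev(voiced)
    return {
        "mean": mu,
        "min": min(voiced),
        "max": max(voiced),
        "std": sigma,
        # Relative range, written sigma_fo / mu_fo in the TL;DR.
        "relative_range": sigma / mu,
    }


# Hypothetical contour: five voiced frames separated by unvoiced gaps.
descriptors = f0_descriptors([0, 100, 0, 120, 110, 130, 0, 140])
```

Dividing by the mean makes the measure insensitive to a speaker's overall pitch level, which is one reason it is easier to compare across speakers than the raw standard deviation.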

Paper Structure (13 sections, 5 equations, 4 figures, 4 tables)

Figures (4)

  • Figure 1: Scatter plots showing the relation between the actual H&Y labels and those predicted by the regression model, for the overall population under study (upper plot) and separately for men (lower left plot) and women (lower right plot). The variables included in the regression model were $LFER$, $MFER$, and $\frac{\sigma_{f_\mathrm{o}}}{\mu_{f_\mathrm{o}}}$.
  • Figure 2: Experimental CDF of the H&Y labels produced by the regression model for PD patients (grey); and complementary CDF of labels produced by the same model for the control group (black). The dashed lines indicate the 99% confidence intervals, calculated as proposed in Higg04.
  • Figure 3: ROC curve corresponding to the detection of PD using the regression model mentioned in the text. The points correspond to the actual results obtained with the dataset. The grey line is a local average of the points. The area under this line equals 0.634, which is an estimate of the $AUC$.
  • Figure 4: Fundamental frequency contours (thick lines) and their components with modulation frequencies up to 6 Hz (thin lines) for two voices with different values of the $LFER$. The low-frequency components were estimated by calculating the DFT of the fundamental frequency contour, zeroing all components at frequencies above 6 Hz, and returning to the time domain by computing the inverse DFT. Unvoiced intervals were handled by using the DFT for non-uniformly spaced samples instead of the standard DFT, as in FSOG15.
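The low-pass procedure described in the Figure 4 caption (DFT, zero everything above 6 Hz, inverse DFT) can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a uniformly sampled contour with no unvoiced gaps and uses a plain DFT, whereas the paper uses a DFT for non-uniformly spaced samples. The names `lowpass_contour` and `frame_rate_hz` and the synthetic test contour are hypothetical.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(), including the 1/n normalisation."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def lowpass_contour(f0, frame_rate_hz, cutoff_hz=6.0):
    """Keep only modulation components at or below cutoff_hz.

    Assumption (not from the paper): f0 is uniformly sampled at
    frame_rate_hz and contains no unvoiced gaps.
    """
    n = len(f0)
    spectrum = dft(f0)
    for k in range(n):
        # Bin k and bin n - k hold the same absolute frequency
        # (positive and negative halves of the spectrum).
        freq = min(k, n - k) * frame_rate_hz / n
        if freq > cutoff_hz:
            spectrum[k] = 0
    # The zeroing is symmetric, so the result is real up to rounding error.
    return [value.real for value in idft(spectrum)]


# Synthetic contour: 120 Hz mean, a slow 2 Hz modulation and a fast 20 Hz one.
frame_rate_hz = 50.0
contour = [
    120.0
    + 10.0 * math.sin(2 * math.pi * 2.0 * i / frame_rate_hz)
    + 3.0 * math.sin(2 * math.pi * 20.0 * i / frame_rate_hz)
    for i in range(100)
]
slow = lowpass_contour(contour, frame_rate_hz)
```

The direct DFT keeps the sketch dependency-free and mirrors the caption step by step; for long contours an FFT routine such as `numpy.fft.rfft` would be the practical choice.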