Exploring Gender-Specific Speech Patterns in Automatic Suicide Risk Assessment

Maurice Gerczuk, Shahin Amiriparian, Justina Lutz, Wolfgang Strube, Irina Papazova, Alkomiet Hasan, Björn W. Schuller

TL;DR

This study addresses automatic suicide risk assessment from speech in emergency medicine, with an emphasis on gender-specific speech markers. It builds a small clinical dataset of 20 patients and compares handcrafted audio functionals with transformer-based wav2vec2.0 representations, using phrase-level normalisation and gender-based modelling. The main finding is that emotion-tuned wav2vec2.0 features paired with gender-exclusive modelling yield up to 81% balanced accuracy at the subject level, with softer weighting (λ=0.1) giving 74% in some settings. Male and female speech show opposite associations with arousal and other affective dimensions. The work demonstrates the feasibility of transformer-based approaches on small clinical datasets and highlights the need for larger, more diverse data to validate gender-specific acoustic markers for suicide risk.

Abstract

In emergency medicine, timely intervention for patients at risk of suicide is often hindered by delayed access to specialised psychiatric care. To bridge this gap, we introduce a speech-based approach for automatic suicide risk assessment. Our study involves a novel dataset comprising speech recordings of 20 patients who read neutral texts. We extract four speech representations encompassing interpretable and deep features. Further, we explore the impact of gender-based modelling and phrase-level normalisation. By applying gender-exclusive modelling, features extracted from an emotion fine-tuned wav2vec2.0 model can be utilised to discriminate high from low suicide risk with a balanced accuracy of 81%. Finally, our analysis reveals a discrepancy in the relationship between speech characteristics and suicide risk for female and male subjects. For men in our dataset, suicide risk increases together with agitation, while voice characteristics of female subjects point the other way.
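To make the reported metric concrete, the following is a minimal sketch of how a subject-level balanced accuracy could be computed from phrase-level predictions. The majority-vote aggregation, the function names, and the toy labels are assumptions for illustration only; the abstract does not state how phrase-level outputs are combined per subject.

```python
from collections import Counter, defaultdict


def aggregate_by_subject(subject_ids, phrase_preds):
    """Majority vote over each subject's phrase-level predictions.

    Assumption: the aggregation rule is hypothetical, not taken from the paper.
    Ties resolve to the label seen first for that subject.
    """
    votes = defaultdict(list)
    for sid, pred in zip(subject_ids, phrase_preds):
        votes[sid].append(pred)
    return {sid: Counter(v).most_common(1)[0][0] for sid, v in votes.items()}


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so each class counts equally regardless of size."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Toy, made-up data: 1 = high risk, 0 = low risk.
subject_ids = ["s1", "s1", "s1", "s2", "s2", "s3", "s3", "s3", "s4", "s4"]
phrase_preds = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
true_labels = {"s1": 1, "s2": 0, "s3": 0, "s4": 1}

subject_preds = aggregate_by_subject(subject_ids, phrase_preds)
subjects = sorted(true_labels)
score = balanced_accuracy(
    [true_labels[s] for s in subjects],
    [subject_preds[s] for s in subjects],
)
print(subject_preds, score)
```

Balanced accuracy is a natural choice for a 20-patient dataset because it averages recall over the high- and low-risk classes, so a classifier cannot score well by always predicting the larger class.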

Paper Structure

This paper contains 13 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: Distribution of normalised arousal, dominance, and valence predictions generated by the pre-trained wav2vec2.0 speech emotion recognition model [Wagner23-DOT] in subjects with high and low suicide risk, additionally separated by gender.
  • Figure 2: Distribution of the most important features, determined by the effect size of the Mann-Whitney U test for low and high suicidality, additionally split by speaker gender.
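As an illustration of the feature ranking mentioned in the Figure 2 caption, the sketch below computes a Mann-Whitney U statistic and one common effect size derived from it, the rank-biserial correlation. Which effect size the authors use is not stated here, so this choice, the function names, and the toy values are assumptions.

```python
def mann_whitney_u(x, y):
    """U statistic for x: count of (x_i, y_j) pairs with x_i > y_j, ties counting 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def rank_biserial(x, y):
    """Rank-biserial correlation in [-1, 1]; 0 means no separation between groups.

    Assumption: chosen for illustration as the effect size of the U test.
    """
    return 2.0 * mann_whitney_u(x, y) / (len(x) * len(y)) - 1.0


# Toy, made-up values of two hypothetical features in high- and low-risk groups.
features = {
    "feature_a": ([0.9, 0.7, 0.8], [0.2, 0.7, 0.1]),
    "feature_b": ([0.5, 0.4, 0.6], [0.5, 0.6, 0.4]),
}

effects = {name: rank_biserial(high, low) for name, (high, low) in features.items()}
# Rank features by absolute effect size, largest first.
ranking = sorted(effects, key=lambda name: abs(effects[name]), reverse=True)
print(effects, ranking)
```

Computing the effect sizes separately for female and male subjects would show a sign flip for any feature whose association with risk points in opposite directions between the two groups.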