Clinically Inspired Symptom-Guided Depression Detection from Emotion-Aware Speech Representations

Chaithra Nerella, Chiranjeevi Yarra

TL;DR

This work proposes a symptom-specific and clinically inspired framework for depression severity estimation from speech that uses a symptom-guided cross-attention mechanism and introduces a learnable symptom-specific parameter that adaptively controls the sharpness of attention distributions.

Abstract

Depression manifests through a diverse set of symptoms, such as sleep disturbance, loss of interest, and concentration difficulties. However, most existing work treats depression prediction as either a binary label or an overall severity score, without explicitly modeling symptom-specific information. This limits its ability to provide the symptom-level analysis relevant to clinical screening. To address this, we propose a symptom-specific and clinically inspired framework for depression severity estimation from speech. Our approach uses a symptom-guided cross-attention mechanism that aligns PHQ-8 questionnaire items with emotion-aware speech representations to identify which segments of a participant's speech are most relevant to each symptom. To account for differences in how symptoms are expressed over time, we introduce a learnable symptom-specific parameter that adaptively controls the sharpness of the attention distributions. Our results on EDAIC, a standard clinical-style dataset, show that the framework outperforms prior work. Further, analysis of the attention distributions shows that higher attention is assigned to utterances containing cues related to multiple depressive symptoms, highlighting the interpretability of our approach. These findings underline the importance of symptom-guided and emotion-aware modeling for speech-based depression screening.
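
The abstract describes cross-attention between symptom items and speech segments, with a learnable per-symptom parameter controlling how sharp each attention distribution is. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes the parameter acts as a positive multiplier on scaled dot-product scores (stored as its logarithm to keep it positive), and every name, dimension, and random input here is hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def symptom_guided_attention(symptom_queries, segment_feats, log_sharpness):
    """Attend from each symptom query to every speech segment.

    symptom_queries: one embedding per questionnaire item (hypothetical).
    segment_feats:   one embedding per speech segment (hypothetical).
    log_sharpness:   one scalar per symptom; exp() keeps the multiplier
                     positive (an assumption, not taken from the paper).
    Returns per-symptom attention weights and pooled segment features.
    """
    dim = len(segment_feats[0])
    all_weights, all_pooled = [], []
    for query, log_s in zip(symptom_queries, log_sharpness):
        sharpness = math.exp(log_s)
        # Scaled dot-product score per segment, multiplied by the
        # symptom-specific sharpness before normalization.
        scores = [
            sharpness * sum(q * k for q, k in zip(query, seg)) / math.sqrt(dim)
            for seg in segment_feats
        ]
        weights = softmax(scores)
        # Attention-weighted average of the segment features.
        pooled = [
            sum(w * seg[j] for w, seg in zip(weights, segment_feats))
            for j in range(dim)
        ]
        all_weights.append(weights)
        all_pooled.append(pooled)
    return all_weights, all_pooled


# Toy inputs: 8 symptom queries (one per PHQ-8 item), 20 segments, 16 dims.
random.seed(0)
NUM_SYMPTOMS, NUM_SEGMENTS, DIM = 8, 20, 16
queries = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_SYMPTOMS)]
segments = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_SEGMENTS)]
log_sharpness = [-1.0 + 2.0 * i / (NUM_SYMPTOMS - 1) for i in range(NUM_SYMPTOMS)]

weights, pooled = symptom_guided_attention(queries, segments, log_sharpness)
```

In this sketch a larger sharpness value concentrates a symptom's attention on fewer segments, while a smaller one spreads it more evenly. In a trained model the `log_sharpness` values would be learned jointly with the rest of the network rather than fixed by hand.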

Paper Structure

This paper contains 14 sections, 1 equation, 3 figures, and 3 tables.

Figures (3)

  • Figure 1: Proposed approach inspired by PHQ-8-based clinical practice
  • Figure 2: Block diagram of the overall framework
  • Figure 3: Visualization of attention heatmap and text alignment of the sentences marked with 'x'