Evaluating Lexicon Incorporation for Depression Symptom Estimation

Kirill Milintsevich, Gaël Dias, Kairit Sirts

TL;DR

This work tests whether adding external lexicon knowledge to transformer-based depression symptom estimation improves performance. It introduces an input-marking strategy that injects sentiment, emotion, and domain-specific signals from AFINN, NRC, and SDD without modifying the model architecture. Results show clear gains on DAIC-WOZ, particularly with NRC and AFINN, and state-of-the-art PHQ-8 estimation with MeBERT when all lexicons are used, while gains on PRIMATE are more modest and inconsistent. The findings suggest that aligning the lexicon with the target task is crucial. They also point to data scarcity and dataset differences as key factors limiting generalization, underscoring the need for larger clinical corpora and for ethical consideration in deployment.

Abstract

This paper explores the impact of incorporating sentiment, emotion, and domain-specific lexicons into a transformer-based model for depression symptom estimation. Lexicon information is added by marking the words in the input transcripts of patient-therapist conversations as well as in social media posts. Overall results show that the introduction of external knowledge within pre-trained language models can be beneficial for prediction performance, while different lexicons show distinct behaviours depending on the targeted task. Additionally, new state-of-the-art results are obtained for the estimation of depression level over patient-therapist interviews.
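
The word-marking idea described in the abstract can be illustrated with a minimal sketch. The marker format (`<neg> ... </neg>`), the function name `mark_words`, and the toy lexicon below are all assumptions made for illustration; this section does not specify how the authors format their markers, and the entries are not taken from AFINN, NRC, or SDD.

```python
import re

# Hypothetical toy lexicon: word -> marker label. These entries are
# illustrative only and are not taken from AFINN, NRC, or SDD.
TOY_LEXICON = {
    "sad": "neg",
    "tired": "neg",
    "happy": "pos",
}


def mark_words(text, lexicon):
    """Wrap every word found in the lexicon with a pair of marker tags.

    Words are matched case-insensitively; all other text is left unchanged.
    The tag format is an assumption, not the paper's documented scheme.
    """
    def wrap(match):
        word = match.group(0)
        label = lexicon.get(word.lower())
        if label is None:
            return word  # not a lexicon word: keep as-is
        return f"<{label}> {word} </{label}>"

    # Match alphabetic tokens (with apostrophes) and rewrite each one.
    return re.sub(r"[A-Za-z']+", wrap, text)


if __name__ == "__main__":
    print(mark_words("I feel sad and tired, but happy today.", TOY_LEXICON))
```

Because the marking happens on the input text, a sketch like this leaves the model itself untouched, which matches the summary's point that the architecture is not modified. In practice the marker strings would likely need to be registered with the tokenizer, which is not shown here.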

Paper Structure (9 sections, 1 equation, 2 figures, 4 tables)

This paper contains 9 sections, 1 equation, 2 figures, 4 tables.

Figures (2)

  • Figure 1: Overview of the model architecture. $U_i^N$ denotes the $i$-th utterance of the $N$-th input. The Symptom Scores are $||L||$ real numbers, where $||L||$ is the number of symptoms to predict.
  • Figure 2: Average predicted values for depressed and non-depressed patients of the DAIC-WOZ test set.