Explainable Depression Symptom Detection in Social Media

Eliseo Bao, Anxo Pérez, Javier Parapar

TL;DR

This work tackles the explainability gap in depression risk detection from social media by grounding predictions in validated depressive symptom markers from the BDI-II framework. It compares two pipeline paradigms: a single-step text-to-text model that jointly classifies and explains, and a two-step approach that separates classification from explanation. It also evaluates in-context learning with LLMs. Explanations are extractive, symptom-grounded rationales, assessed with offline metrics and expert evaluation on the PsySym and BDI-Sen datasets; the results show strong classification performance without sacrificing interpretability. The study demonstrates the clinical usefulness of symptom-based explanations, analyzes data efficiency, and discusses the role of domain-tuned LLMs, while outlining ethical and generalization considerations and avenues for temporal and cross-platform extension.

Abstract

Users of social platforms often perceive these sites as supportive spaces to post about their mental health issues. Those conversations contain important traces of individuals' health risks. Recently, researchers have exploited this online information to construct mental health detection models, which aim to identify users at risk on platforms like Twitter, Reddit or Facebook. Most of these models are centred on achieving good classification results, ignoring the explainability and interpretability of the decisions. Recent research has pointed out the importance of using clinical markers, such as symptoms, to improve health professionals' trust in the computational models. In this paper, we propose using transformer-based architectures to detect and explain the appearance of depressive symptom markers in users' writings. We present two approaches: i) train one model to classify and a separate model to explain the classifier's decision, and ii) unify the two tasks in a single model. For the latter approach, we also investigate the performance of recent conversational LLMs when using in-context learning. Our natural language explanations enable clinicians to interpret the models' decisions based on validated symptoms, enhancing trust in the automated process. We evaluate our approach using recent symptom-based datasets, employing both offline and expert-in-the-loop metrics to assess the quality of the explanations generated by our models. The experimental results show that it is possible to achieve good classification results while generating interpretable symptom-based explanations.
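
The difference between the two approaches can be made concrete with a minimal Python sketch. Everything in it is an illustrative assumption rather than the authors' implementation: the function names, the label strings, the `explain symptom:` input prefix, and the `label explanation: span` output format are hypothetical, and the keyword-based toy functions merely stand in for the transformer models.

```python
from typing import Callable, NamedTuple


class Prediction(NamedTuple):
    label: str        # hypothetical labels: "relevant" / "not relevant"
    explanation: str  # extractive span copied from the post ("" if none)


def two_step(post: str,
             classify: Callable[[str], str],
             explain: Callable[[str, str], str]) -> Prediction:
    """Approach i): one model classifies, a second one explains the decision."""
    label = classify(post)
    explanation = explain(post, label) if label == "relevant" else ""
    return Prediction(label, explanation)


def parse_joint_output(text: str) -> Prediction:
    """Split an assumed 'label explanation: span' string into its two parts."""
    label, sep, explanation = text.partition(" explanation: ")
    return Prediction(label.strip(), explanation.strip() if sep else "")


def single_step(post: str, generate: Callable[[str], str]) -> Prediction:
    """Approach ii): a single text-to-text model emits label and explanation."""
    return parse_joint_output(generate(f"explain symptom: {post}"))


# --- Toy stand-ins for the real models (assumptions, for illustration only) ---
def toy_classify(post: str) -> str:
    return "relevant" if "sleep" in post.lower() else "not relevant"


def toy_explain(post: str, label: str) -> str:
    # Return the first sentence that mentions the keyword (extractive).
    for sentence in post.split("."):
        if "sleep" in sentence.lower():
            return sentence.strip()
    return ""


def toy_generate(prompt: str) -> str:
    post = prompt.split(": ", 1)[1]  # drop the assumed task prefix
    label = toy_classify(post)
    if label == "relevant":
        return f"{label} explanation: {toy_explain(post, label)}"
    return label


if __name__ == "__main__":
    post = "I feel tired. I cannot sleep at night."
    print(two_step(post, toy_classify, toy_explain))
    print(single_step(post, toy_generate))
```

In this sketch both paths return the same `Prediction` structure, so the two paradigms could be scored with the same classification and explanation metrics. The in-context learning variant would fit the `single_step` path, with `generate` wrapping a prompted conversational LLM.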

Paper Structure (34 sections, 1 equation, 5 figures, 5 tables)

This paper contains 34 sections, 1 equation, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Overall pipeline of our proposals for the classification and generation of natural language explanations for the presence of depression symptom information in social media posts.
  • Figure 2: Confusion matrices showing the prediction accuracy of the proposed settings and the WT5, WBART, MBERT and GPT-3.5 systems.
  • Figure 3: Offline metrics as a function of the number of samples kept in the training data for the M-M setting.
  • Figure 4: External dataset validation incorporating DepreSym training samples.
  • Figure 5: Prompt used for our experiments with conversational LLMs.