Leveraging LLMs for Translating and Classifying Mental Health Data
Konstantinos Skianis, A. Seza Doğruöz, John Pavlopoulos
TL;DR
This study probes the feasibility of using large language models (LLMs) to detect depression severity from English user-generated posts and their automatic Greek translations. It translates English text to Greek with GPT-3.5-turbo and then prompts the model to assign four severity levels, evaluating performance on the DepSeverity Reddit dataset. Results show modest English performance (best F1 around 0.25 for minimal and 0.22 for severe) and pronounced degradation after translation to Greek, with mild depression being particularly challenging (F1 ≈0.06). The findings underscore the need for caution in deploying LLM-based mental-health tools across languages, highlight the importance of human supervision, and point to the value of developing multilingual, resource-aware approaches for clinical support and professional training.
Abstract
Large language models (LLMs) are increasingly used in medical fields. In mental health support, the early identification of linguistic markers associated with mental health conditions can provide valuable support to mental health professionals, and reduce long waiting times for patients. Despite the benefits of LLMs for mental health support, there is limited research on their application in mental health systems for languages other than English. Our study addresses this gap by focusing on the detection of depression severity in Greek through user-generated posts which are automatically translated from English. Our results show that GPT3.5-turbo is not very successful in identifying the severity of depression in English, and it has a varying performance in Greek as well. Our study underscores the necessity for further research, especially in languages with less resources. Also, careful implementation is necessary to ensure that LLMs are used effectively in mental health platforms, and human supervision remains crucial to avoid misdiagnosis.
