Leveraging LLMs for Mental Health: Detection and Recommendations from Social Discussions
Vaishali Aggarwal, Sachin Thukral, Krushil Patel, Arnab Chatterjee
TL;DR
The paper addresses the detection of mental health issues from social media with a hybrid framework that combines domain-adapted NLP models with large language models (LLMs) to detect disorders, assess severity, and generate intervention recommendations from Reddit posts. The multi-stage pipeline comprises binary relevance filtering, disorder classification across nine SMHD-derived labels, severity categorization, and a therapy/behavior-change recommendation module; it is evaluated with 5-fold cross-validation on a 5,000-post subset of a larger Reddit dataset. Results show that dictionary-based labeling is stable but misses nuanced cases, while LLMs capture richer multi-label information, with performance varying by annotator and task; time-complexity analyses reveal clear trade-offs between model capability and computational cost. The framework contributes to mental health informatics by supporting early detection and personalized digital health interventions, with attention to ethics, safety, and human oversight in real-world deployments.
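As a rough illustration of the staged pipeline and evaluation split summarized above, the following Python sketch chains four stages and partitions a dataset into five folds. It is a minimal sketch and not the authors' implementation: `Assessment`, `assess_post`, `kfold_indices`, and the four stage callables are hypothetical names, and the early exit for irrelevant posts and the contiguous (unshuffled) folds are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional, Tuple


@dataclass
class Assessment:
    """Hypothetical container for the output of one pipeline pass."""
    relevant: bool
    disorders: List[str] = field(default_factory=list)
    severity: Optional[str] = None
    recommendation: Optional[str] = None


def assess_post(
    text: str,
    is_relevant: Callable[[str], bool],
    classify: Callable[[str], List[str]],
    rate_severity: Callable[[str, List[str]], str],
    recommend: Callable[[List[str], str], str],
) -> Assessment:
    """Run one post through the four stages; each stage is a pluggable callable."""
    # Stage 1: binary relevance filter (assumption: irrelevant posts stop here).
    if not is_relevant(text):
        return Assessment(relevant=False)
    # Stage 2: multi-label disorder classification.
    disorders = classify(text)
    # Stage 3: severity categorization.
    severity = rate_severity(text, disorders)
    # Stage 4: therapy / behavior-change recommendation.
    return Assessment(True, disorders, severity, recommend(disorders, severity))


def kfold_indices(n: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds over n items."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size
```

With 5,000 posts and `k=5`, each fold holds out 1,000 posts and trains on the remaining 4,000. In practice the stage callables would wrap the fine-tuned classifiers or LLM prompts, and the indices would typically be shuffled before splitting.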
Abstract
Textual data from social platforms captures many aspects of mental health: users discuss a range of issues and reach out for help, while others sympathize and offer support. We propose a comprehensive framework that leverages Natural Language Processing (NLP) and Generative AI techniques to identify and assess mental health disorders, detect their severity, and create recommendations for behavior change and therapeutic interventions based on users' posts on Reddit. To classify the disorders, we use rule-based labeling methods as well as advanced pre-trained NLP models to extract nuanced semantic features from the data. We fine-tune domain-adapted and generic pre-trained NLP models based on predictions from specialized Large Language Models (LLMs) to improve classification accuracy. Our hybrid approach combines the generalization capabilities of pre-trained models with the domain-specific insights captured by LLMs, providing an improved understanding of mental health discourse. Our findings highlight the strengths and limitations of each model, offering valuable insights into their practical applicability. This research could facilitate early detection and personalized care to aid practitioners, enabling timely interventions and improved overall well-being, thereby contributing to the broader field of mental health surveillance and digital health analytics.
