A Dual-Prompting for Interpretable Mental Health Language Models

Hyolim Jeon, Dongje Yoo, Daeun Lee, Sejung Son, Seungbae Kim, Jinyoung Han

TL;DR

The paper addresses the interpretability of LLM-based mental health analysis with a dual-prompting framework that couples knowledge-aware evidence extraction, which uses expert-identity prompts and a suicide dictionary, with an extract-then-generate summarization pipeline guided by a consistency evaluator. It uses MentaLLaMA for domain-specific prompting and SOLAR for consistency-based selection, aiming to give clinicians traceable evidence across posts. Experiments on the CLPsych 2024 dataset show a modest improvement in extraction recall when few-shot examples and domain knowledge are used, and the findings indicate that general-domain LLMs can outperform domain-tuned models for summarization when paired with a consistency evaluator. The authors acknowledge limitations, including the lack of ground truth and the reliance on small-scale models, and point future work toward broader prompts, more evaluators, and domain expansion.

Abstract

Despite the increasing demand for AI-based mental health monitoring tools, their practical utility for clinicians is limited by a lack of interpretability. The CLPsych 2024 Shared Task (Chim et al., 2024) aims to enhance the interpretability of Large Language Models (LLMs), particularly in mental health analysis, by providing evidence of suicidality through linguistic content. We propose a dual-prompting approach: (i) knowledge-aware evidence extraction, which leverages an expert identity and a suicide dictionary with a mental health-specific LLM; and (ii) evidence summarization, which employs an LLM-based consistency evaluator. Comprehensive experiments demonstrate the effectiveness of combining domain-specific information, revealing performance improvements and the approach's potential to aid clinicians in assessing mental state progression.
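
The selection step in (ii) can be pictured as scoring several candidate summaries against the extracted evidence and keeping the most consistent one. The sketch below is only an illustration under stated assumptions: the function names `overlap_score` and `select_summary` are hypothetical, and the token-overlap scorer is a toy stand-in for the LLM-based consistency evaluator, whose actual prompt and scoring scheme are not given in this abstract.

```python
from typing import Callable, List, Tuple


def overlap_score(evidence: List[str], summary: str) -> float:
    """Toy consistency score: fraction of summary tokens that occur in the evidence.

    Assumption: the paper uses an LLM as the evaluator. This function is a
    stand-in so that the selection logic can run without any model.
    """
    evidence_tokens = {tok.lower() for span in evidence for tok in span.split()}
    summary_tokens = [tok.lower() for tok in summary.split()]
    if not summary_tokens:
        return 0.0
    hits = sum(tok in evidence_tokens for tok in summary_tokens)
    return hits / len(summary_tokens)


def select_summary(
    evidence: List[str],
    candidates: List[str],
    scorer: Callable[[List[str], str], float] = overlap_score,
) -> Tuple[str, float]:
    """Return the candidate summary with the highest consistency score."""
    if not candidates:
        raise ValueError("at least one candidate summary is required")
    scored = [(scorer(evidence, cand), cand) for cand in candidates]
    # max() keeps the first candidate when scores tie.
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


if __name__ == "__main__":
    evidence = ["I cannot sleep"]
    candidates = ["the user cannot sleep", "the user enjoys hiking"]
    print(select_summary(evidence, candidates))
```

Passing a different `scorer` (for example, a wrapper around an LLM call) would replace the toy heuristic without changing the selection logic.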

Paper Structure (18 sections, 2 figures, 4 tables)

Figures (2)

  • Figure 1: The overall architecture of the proposed approach: (a) Knowledge-aware Evidence Extraction and (b) Evidence Summarization with LLM-based Consistency Evaluator.
  • Figure 2: Winner count comparison for MentaLLaMA (Yang et al., 2023) and SOLAR (Kim et al., 2023) on the 125-instance evaluation set, as judged by the evaluator.