SMILE: Single-turn to Multi-turn Inclusive Language Expansion via ChatGPT for Mental Health Support

Huachuan Qiu, Hongliang He, Shuai Zhang, Anqi Li, Zhenzhong Lan

TL;DR

SMILE tackles the lack of large-scale, diverse multi-turn mental health dialogues by prompting ChatGPT to rewrite public single-turn QA pairs as multi-turn conversations. The approach yields SMILECHAT, a Chinese dataset of 55k dialogues, which is used to build the chatbot MeChat through parameter-efficient fine-tuning of ChatGLM2-6B. Language transformation and diversity analyses indicate that the generated dialogues are lifelike and diverse, and both automatic metrics and human evaluation on PsyTest, an anonymized real-life counseling dataset, support the quality of the data. The data, code, and model are publicly released, and the method may be applicable to domains beyond mental health.

Abstract

Developing specialized dialogue systems for mental health support requires multi-turn conversation data, which has recently garnered increasing attention. However, gathering and releasing large-scale, real-life multi-turn conversations that could facilitate advancements in mental health support presents challenges in data privacy protection and the time and cost involved in crowdsourcing. To address these challenges, we introduce SMILE, a single-turn to multi-turn inclusive language expansion technique that prompts ChatGPT to rewrite public single-turn dialogues into multi-turn ones. Our work begins by analyzing language transformation and validating the feasibility of our proposed method. We conduct a study on dialogue diversity, including lexical features, semantic features, and dialogue topics, demonstrating the effectiveness of our method. Further, we employ our method to generate a large-scale, lifelike, and diverse dialogue dataset named SMILECHAT, consisting of 55k dialogues. Finally, we utilize the collected corpus to develop a mental health chatbot, MeChat. To better assess the quality of SMILECHAT, we collect a small-scale real-life counseling dataset conducted by data anonymization. Both automatic and human evaluations demonstrate significant improvements in our dialogue system and confirm that SMILECHAT is high-quality. Code, data, and model are publicly available at https://github.com/qiuhuachuan/smile.

Paper Structure (62 sections, 5 equations, 23 figures, 6 tables)

Figures (23)

  • Figure 1: The SMILE method used to generate dialogues for mental health support.
  • Figure 2: The SMILE method used to generate dialogues for mental health support.
  • Figure 3: Mechanism for language transformation.
  • Figure 4: Distribution of dialogue transformation among three methods. The line $x=0.9312$ represents the boundary of $\mu -3\sigma$ in the SMILE method.
  • Figure 5: Pairwise dialogue cosine similarity among four settings: our proposed three methods and a reference point using sampled data from PsyQA.
  • ...and 18 more figures
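
The captions of Figures 4 and 5 refer to two standard computations: a lower boundary at $\mu - 3\sigma$ over a distribution of per-dialogue scores, and pairwise cosine similarity between dialogues. The following is a minimal sketch of both, not the authors' implementation. The function names, the toy scores, and the toy embedding vectors are all hypothetical, and the use of the population standard deviation is an assumption, since the captions do not say which estimator is used.

```python
import math
from itertools import combinations
from statistics import mean, pstdev


def lower_three_sigma_bound(scores):
    """Return mu - 3*sigma for a list of scores.

    Assumption: population standard deviation (pstdev); the source
    does not state which estimator the authors use.
    """
    return mean(scores) - 3 * pstdev(scores)


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_cosine(vectors):
    """Cosine similarity for every unordered pair of vectors."""
    return [cosine_similarity(u, v) for u, v in combinations(vectors, 2)]


# Hypothetical toy inputs, for illustration only (not data from the paper).
scores = [0.98, 0.97, 0.99, 0.96, 1.00]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

bound = lower_three_sigma_bound(scores)
sims = pairwise_cosine(embeddings)
print(f"mu - 3*sigma boundary: {bound:.4f}")
print("pairwise cosine similarities:", [round(s, 4) for s in sims])
```

In practice the vectors would come from a sentence or dialogue encoder, and lower pairwise similarity across a dataset would suggest more semantically diverse dialogues.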