Table of Contents
Fetching ...

LLM-empowered Chatbots for Psychiatrist and Patient Simulation: Application and Evaluation

Siyuan Chen, Mengyue Wu, Kenny Q. Zhu, Kunyao Lan, Zhiling Zhang, Lyuchun Cui

TL;DR

This work addresses the lack of validated chatbots for psychiatric outpatient diagnosis by developing LLM-powered doctor and patient simulators using ChatGPT. It adopts a three-phase, human-centered design with iterative prompt engineering and a dual evaluation framework combining human judgments and automatic metrics, validated by real depression patients and psychiatrists. Key contributions include formalizing the task, establishing an evaluation framework tailored to diagnostic conversations, and demonstrating that carefully designed prompts can yield feasible, empathetic, and patient-like interactions in professional domains. The findings highlight design trade-offs and offer guidance for scalable psychiatric screening tools and medical education applications, while outlining ethical safeguards for data privacy and participant welfare.

Abstract

Empowering chatbots in the field of mental health is receiving increasing amount of attention, while there still lacks exploration in developing and evaluating chatbots in psychiatric outpatient scenarios. In this work, we focus on exploring the potential of ChatGPT in powering chatbots for psychiatrist and patient simulation. We collaborate with psychiatrists to identify objectives and iteratively develop the dialogue system to closely align with real-world scenarios. In the evaluation experiments, we recruit real psychiatrists and patients to engage in diagnostic conversations with the chatbots, collecting their ratings for assessment. Our findings demonstrate the feasibility of using ChatGPT-powered chatbots in psychiatric scenarios and explore the impact of prompt designs on chatbot behavior and user experience.

LLM-empowered Chatbots for Psychiatrist and Patient Simulation: Application and Evaluation

TL;DR

This work addresses the lack of validated chatbots for psychiatric outpatient diagnosis by developing LLM-powered doctor and patient simulators using ChatGPT. It adopts a three-phase, human-centered design with iterative prompt engineering and a dual evaluation framework combining human judgments and automatic metrics, validated by real depression patients and psychiatrists. Key contributions include formalizing the task, establishing an evaluation framework tailored to diagnostic conversations, and demonstrating that carefully designed prompts can yield feasible, empathetic, and patient-like interactions in professional domains. The findings highlight design trade-offs and offer guidance for scalable psychiatric screening tools and medical education applications, while outlining ethical safeguards for data privacy and participant welfare.

Abstract

Empowering chatbots in the field of mental health is receiving increasing amount of attention, while there still lacks exploration in developing and evaluating chatbots in psychiatric outpatient scenarios. In this work, we focus on exploring the potential of ChatGPT in powering chatbots for psychiatrist and patient simulation. We collaborate with psychiatrists to identify objectives and iteratively develop the dialogue system to closely align with real-world scenarios. In the evaluation experiments, we recruit real psychiatrists and patients to engage in diagnostic conversations with the chatbots, collecting their ratings for assessment. Our findings demonstrate the feasibility of using ChatGPT-powered chatbots in psychiatric scenarios and explore the impact of prompt designs on chatbot behavior and user experience.
Paper Structure (61 sections, 6 figures, 17 tables)

This paper contains 61 sections, 6 figures, 17 tables.

Figures (6)

  • Figure 1: The overview of the psychiatrist-guided three-phase study.
  • Figure 2: The iterative development process of the prompt of doctor chatbots. Psychiatrists will identify the limitations of the current version, and we will address these issues in the subsequent version.
  • Figure 3: The Proportion of Symptoms Asked by Different Doctor Chatbots and Human Doctor.
  • Figure 4: Dialogue Act Comparison between Different Doctor Chatbots and Human doctor.
  • Figure 5: The reponse generation process of ChatGPT-based chatbots. ① means combining the system message and the dialogue histroy together as the input of ChatGPT. ② means ChatGPT generates new response according to the input.
  • ...and 1 more figures