Towards a Client-Centered Assessment of LLM Therapists by Client Simulation

Jiashuo Wang, Yang Xiao, Yanran Li, Changhe Song, Chunpu Xu, Chenhao Tan, Wenjie Li

TL;DR

This work uses LLMs to simulate clients and proposes ClientCAST, a client-centered approach to assessing LLM therapists by client simulation: the simulated client interacts with LLM therapists and then completes questionnaires about the interaction.

Abstract

Although there is a growing belief that LLMs can be used as therapists, research exploring LLMs' capabilities and inefficacy, particularly from the client's perspective, is limited. This work focuses on a client-centered assessment of LLM therapists with the involvement of simulated clients, a standard approach in clinical medical education. However, two challenges arise when applying this approach to assess LLM therapists at scale. Ethically, asking humans to frequently mimic clients and exposing them to potentially harmful LLM outputs can be risky and unsafe. Technically, it can be difficult to consistently compare the performance of different LLM therapists interacting with the same client. To this end, we adopt LLMs to simulate clients and propose ClientCAST, a client-centered approach to assessing LLM therapists by client simulation. Specifically, the simulated client interacts with LLM therapists and completes questionnaires related to the interaction. Based on the questionnaire results, we assess LLM therapists from three client-centered aspects: session outcome, therapeutic alliance, and self-reported feelings. We conduct experiments to examine the reliability of ClientCAST and use it to evaluate LLM therapists implemented with Claude-3, GPT-3.5, LLaMA3-70B, and Mixtral 8x7B. Code is released at https://github.com/wangjs9/ClientCAST.
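The assessment loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Agent` and `Rater` callables, the `run_session` and `assess` helpers, the stub agents, and the placeholder questionnaire items are all hypothetical names chosen for illustration, and averaging item ratings per aspect is an assumed aggregation. In practice the stubs would be replaced by LLM calls and the items by real questionnaire instruments.

```python
from statistics import mean
from typing import Callable, Dict, List

Turn = Dict[str, str]
# Hypothetical interfaces: an agent maps the dialogue so far to its next utterance,
# and a rater maps (dialogue, questionnaire item) to a numeric rating.
Agent = Callable[[List[Turn]], str]
Rater = Callable[[List[Turn], str], int]


def run_session(client: Agent, therapist: Agent, max_turns: int = 3) -> List[Turn]:
    """Alternate client and therapist utterances for a fixed number of exchanges."""
    history: List[Turn] = []
    for _ in range(max_turns):
        history.append({"role": "client", "text": client(history)})
        history.append({"role": "therapist", "text": therapist(history)})
    return history


def assess(history: List[Turn], rate: Rater,
           questionnaires: Dict[str, List[str]]) -> Dict[str, float]:
    """Average the simulated client's item ratings per aspect (assumed aggregation)."""
    return {
        aspect: mean(rate(history, item) for item in items)
        for aspect, items in questionnaires.items()
    }


# Stub agents standing in for LLM calls (assumptions, for illustration only).
def stub_client(history: List[Turn]) -> str:
    return f"client utterance {len(history) // 2 + 1}"


def stub_therapist(history: List[Turn]) -> str:
    return "therapist reply"


def stub_rater(history: List[Turn], item: str) -> int:
    return 4  # a fixed rating; a simulated client would answer each item itself


# Placeholder items, not the questionnaires used in the paper.
questionnaires = {
    "session_outcome": ["placeholder item A", "placeholder item B"],
    "therapeutic_alliance": ["placeholder item C"],
    "self_reported_feelings": ["placeholder item D"],
}

history = run_session(stub_client, stub_therapist, max_turns=3)
scores = assess(history, stub_rater, questionnaires)
```

Because the simulated client is a function of a fixed profile, the same client can be paired with different therapists by swapping only the `therapist` argument, which is the property that makes side-by-side comparison straightforward in this sketch.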

Paper Structure (58 sections, 6 equations, 10 figures, 18 tables)

Figures (10)

  • Figure 1: Overview of the ClientCAST framework. An LLM serves as a simulated client equipped with a specific psychological profile. The simulated client interacts with an LLM therapist and completes questionnaires regarding their interaction. Finally, ClientCAST provides a client-centered assessment of the LLM therapist based on the questionnaire results.
  • Figure 2: Session outcome, therapeutic alliance, and self-reported feelings scores of high- and low-quality sessions in the High-Low Quality Counseling and AnnoMI datasets.
  • Figure 3: The proportion of inconsistent simulated clients who exhibit a higher level of apparent traits. EF: Emotion Fluctuations, UWE: UnWillingness to express emotions, RT: Resistance toward the Therapist.
  • Figure 4: LLM therapist assessments on session outcome, therapeutic alliance, and self-reported feelings using ClientCAST.
  • Figure 5: The prompt used to extract a psychological profile from a counseling session.
  • ...and 5 more figures