Interactive Agents: Simulating Counselor-Client Psychological Counseling via Role-Playing LLM-to-LLM Interactions
Huachuan Qiu, Zhenzhong Lan
TL;DR
This work introduces a two-LLM role-play framework to simulate counselor-client psychological counseling sessions, addressing scalability and privacy barriers of human data. By using an LLM-based client with real-life profiles and an integrative therapy–driven counselor, the authors generate a large synthetic dataset, SimPsyDial, and train a dialogue model SimPsyBot on it. They thoroughly evaluate both client fidelity and counselor quality, using WAI-O-S and distributional analyses, and show that the synthetic data can outperform state-of-the-art mental-health models in both automatic and human evaluations. The study highlights the potential of LLM-driven simulations to advance mental health dialogue systems while acknowledging ethical considerations and proposing future enhancements such as resistance modeling and retrieval-augmented generation.
Abstract
Virtual counselors powered by large language models (LLMs) aim to create interactive support systems that effectively assist clients struggling with mental health challenges. To replicate counselor-client conversations, researchers have built an online mental health platform that allows professional counselors to provide clients with text-based counseling services for about an hour per session. Notwithstanding its effectiveness, challenges exist as human annotation is time-consuming, cost-intensive, privacy-protected, and not scalable. To address this issue and investigate the applicability of LLMs in psychological counseling conversation simulation, we propose a framework that employs two LLMs via role-playing for simulating counselor-client interactions. Our framework involves two LLMs, one acting as a client equipped with a specific and real-life user profile and the other playing the role of an experienced counselor, generating professional responses using integrative therapy techniques. We implement both the counselor and the client by zero-shot prompting the GPT-4 model. In order to assess the effectiveness of LLMs in simulating counselor-client interactions and understand the disparities between LLM- and human-generated conversations, we evaluate the synthetic data from various perspectives. We begin by assessing the client's performance through automatic evaluations. Next, we analyze and compare the disparities between dialogues generated by the LLM and those generated by professional counselors. Furthermore, we conduct extensive experiments to thoroughly examine the performance of our LLM-based counselor trained with synthetic interactive dialogues by benchmarking against state-of-the-art models for mental health.
