Modeling Challenging Patient Interactions: LLMs for Medical Communication Training
Anna Bodonhelyi, Christian Stegemann-Philipps, Alessandra Sonanini, Lea Herschbach, Marton Szep, Anne Herrmann-Werner, Teresa Festl-Wietek, Enkelejda Kasneci, Friederike Holderried
TL;DR
The paper addresses the lack of realism in medical communication training by engineering multilingual virtual patients (VPs) that embody the Satir-inspired accuser and rationalizer personas through prompt engineering and illness scripts. It introduces a framework that combines author's notes, first messages, and a stubbornness mechanism to produce authentic, emotionally textured interactions, which psychotherapy professionals evaluated at an international conference. Using Likert-based assessments and automated emotion and sentiment analysis, the study shows that participants perceived the VPs as authentic (≈3.7–3.8/5) and could correctly identify their styles, with emotion profiles matching the intended personas. The work argues that AI-driven virtual patients offer scalable, cost-effective tools to prepare clinicians for challenging interpersonal dynamics, while acknowledging limitations and outlining future work to broaden the stylistic repertoire and data integration.
Abstract
Effective patient communication is pivotal in healthcare, yet traditional medical training often lacks exposure to diverse, challenging interpersonal dynamics. To bridge this gap, this study proposes the use of Large Language Models (LLMs) to simulate authentic patient communication styles, specifically the "accuser" and "rationalizer" personas derived from the Satir model, while also ensuring multilingual applicability to accommodate diverse cultural contexts and enhance accessibility for medical professionals. Leveraging advanced prompt engineering, including behavioral prompts, author's notes, and stubbornness mechanisms, we developed virtual patients (VPs) that embody nuanced emotional and conversational traits. Medical professionals evaluated these VPs, rating their authenticity on a 5-point Likert scale from one to five (accuser: $3.8 \pm 1.0$; rationalizer: $3.7 \pm 0.8$) and correctly identifying their styles. Emotion analysis revealed distinct profiles: the accuser exhibited pain, anger, and distress, while the rationalizer displayed contemplation and calmness, in line with the predefined, detailed patient descriptions, which include medical history. Sentiment scores (on a scale from zero to nine) further validated these differences in communication style, with the accuser adopting a negative tone ($3.1 \pm 0.6$) and the rationalizer a more neutral one ($4.0 \pm 0.4$). These results underscore LLMs' capability to replicate complex communication styles, offering transformative potential for medical education. This approach equips trainees to navigate challenging clinical scenarios by providing realistic, adaptable patient interactions, enhancing empathy and diagnostic acumen. Our findings advocate for AI-driven tools as scalable, cost-effective solutions to cultivate nuanced communication skills, setting a foundation for future innovations in healthcare training.
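The authenticity and sentiment figures above are reported in the form "mean ± spread". As a minimal sketch of how such a summary can be computed from raw ratings, the Python snippet below assumes the spread is the sample standard deviation (the abstract does not say which estimator was used). The function names and the example ratings are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def summarize_likert(ratings, low=1, high=5):
    """Return (mean, sample standard deviation) for a list of Likert ratings.

    Assumption: the spread is the sample SD (n - 1 denominator); the source
    does not state which estimator it uses.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings to compute a standard deviation")
    for r in ratings:
        if not low <= r <= high:
            raise ValueError(f"rating {r} is outside the scale [{low}, {high}]")
    return mean(ratings), stdev(ratings)


def format_summary(ratings, low=1, high=5):
    """Format ratings as 'mean ± SD' with one decimal place."""
    m, s = summarize_likert(ratings, low, high)
    return f"{m:.1f} ± {s:.1f}"


# Hypothetical ratings for illustration only, NOT data from the study.
example_ratings = [4, 5, 3, 4, 2, 5, 4, 3]
print(format_summary(example_ratings))
```

The same helper could be applied to the zero-to-nine sentiment scale by passing `low=0, high=9`, again only as an illustration of the arithmetic rather than a reproduction of the authors' analysis pipeline.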
