Human or LLM as Standardized Patients? A Comparative Study for Medical Education
Bingquan Zhang, Xiaoxiao Liu, Yuchi Wang, Lei Zhou, Qianqian Xie, Benyou Wang
TL;DR
This paper proposes EasyMED, a multi-agent framework for virtual Standardized Patient (SP) training that separates patient simulation, intent recognition, and evaluation into coordinated components. It also introduces SPBench, a benchmark built from authentic SP–student dialogues across 14 specialties and eight evaluation criteria, enabling reproducible, turn- and session-level assessment. In a four-week controlled crossover study with undergraduate medical students, EasyMED delivered learning outcomes comparable to traditional human SP training while offering greater flexibility, psychological safety, and substantial cost reductions, with pronounced benefits for lower-baseline learners. The work demonstrates the feasibility and pedagogical value of multi-agent, LLM-based SP systems for scalable, high-quality clinical education and provides a path toward widespread, cost-effective skills training in medical education.
Abstract
Standardized Patients (SPs) are indispensable for clinical skills training but remain expensive, inflexible, and difficult to scale. Existing large language model (LLM)-based SP simulators promise lower cost, yet they behave inconsistently and have not been rigorously compared with human SPs. We present EasyMED, a multi-agent framework combining a Patient Agent for realistic dialogue, an Auxiliary Agent for factual consistency, and an Evaluation Agent that delivers actionable feedback. To support systematic assessment, we introduce SPBench, a benchmark of real SP-doctor interactions spanning 14 specialties and eight expert-defined evaluation criteria. Experiments demonstrate that EasyMED matches the learning outcomes of human SP training while producing greater skill gains for lower-baseline students and offering improved flexibility, psychological safety, and cost efficiency.
