MedTutor-R1: Socratic Personalized Medical Teaching with Multi-Agent Simulation
Zhitao He, Haolin Yang, Zeyu Qin, Yi R Fung
TL;DR
The paper tackles the shortage of expert clinical teaching by introducing ClinEdu, a multi-agent pedagogical simulator, and ClinTeach, a large Socratic teaching dialogue dataset used to train MedTutor-R1, the first multimodal tutor for one-to-many clinical instruction. MedTutor-R1 is first instruction-tuned on ClinTeach and then refined via reinforcement learning, with rewards derived from a three-axis rubric (structural fidelity, analytical quality, clinical safety), to optimize its adaptive Socratic strategies. It is evaluated by redeploying the tutor into ClinEdu for simulation-based in-situ testing, where it outperforms the base model by over 20% in average pedagogical score, performs comparably to o3, and adapts well to varying numbers of students. The work offers a scalable, data-driven framework for group-based medical education that could broaden access to high-quality clinical training while maintaining safety and pedagogical rigor.
Abstract
The gap between rising demand for clinical training and the scarcity of expert instruction poses a major challenge to medical education. With their strong capabilities in personalized guidance, Large Language Models (LLMs) offer a promising way to bridge this gap. However, current research focuses mainly on one-on-one knowledge instruction and overlooks collaborative reasoning, a key skill that students develop through teamwork such as ward rounds. To this end, we develop ClinEdu, a multi-agent pedagogical simulator with personality-driven patients and diverse student cohorts, enabling controlled testing of complex pedagogical processes and scalable generation of teaching data. Based on ClinEdu, we construct ClinTeach, a large Socratic teaching dialogue dataset that captures the complexities of group instruction. We then train MedTutor-R1, the first multimodal Socratic tutor designed for one-to-many instruction in clinical medical education. MedTutor-R1 is first instruction-tuned on the ClinTeach dataset and then optimized with reinforcement learning, using rewards derived from a three-axis rubric covering structural fidelity, analytical quality, and clinical safety, to refine its adaptive Socratic strategies. For authentic in-situ assessment, we use simulation-based interactive evaluation that redeploys the tutor into ClinEdu. Experimental results demonstrate that MedTutor-R1 outperforms the base model by over 20% in average pedagogical score and is comparable to o3, while also adapting well to a varying number of students. This promising performance underscores the effectiveness of our pedagogical simulator, ClinEdu.
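To make the idea of a rubric-derived reward concrete, the sketch below collapses three per-axis scores into a single scalar of the kind a reinforcement learning step could consume. It is illustrative only: the abstract does not specify how the axes are scored or combined, so the `RubricScores` container, the `rubric_reward` function, the [0, 1] score range, and the equal default weights are all assumptions rather than the authors' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RubricScores:
    """Hypothetical per-axis scores for one tutor response, each assumed to lie in [0, 1]."""
    structural_fidelity: float
    analytical_quality: float
    clinical_safety: float


def rubric_reward(scores, weights=(1.0, 1.0, 1.0)):
    """Combine the three axis scores into one scalar via a normalized weighted mean.

    The weighting scheme is an assumption made for illustration; the paper's
    actual aggregation is not described in the abstract.
    """
    axes = (
        scores.structural_fidelity,
        scores.analytical_quality,
        scores.clinical_safety,
    )
    if len(weights) != len(axes):
        raise ValueError("expected one weight per rubric axis")
    if any(not 0.0 <= value <= 1.0 for value in axes):
        raise ValueError("axis scores must lie in [0, 1]")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * v for w, v in zip(weights, axes)) / total_weight


if __name__ == "__main__":
    example = RubricScores(
        structural_fidelity=1.0,
        analytical_quality=0.5,
        clinical_safety=0.0,
    )
    # Weighting clinical safety twice as heavily as the other two axes.
    print(rubric_reward(example, weights=(1.0, 1.0, 2.0)))
```

A weighted mean keeps the reward bounded in [0, 1] and lets one axis, such as clinical safety, count more heavily than the others. Other designs, for example zeroing the reward whenever the safety score falls below a threshold, would be equally plausible under the same description.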
