MedSimAI: Simulation and Formative Feedback Generation to Enhance Deliberate Practice in Medical Education

Yann Hicke, Jadon Geathers, Kellen Vu, Justin Sewell, Claire Cardie, Jaideep Talwalkar, Dennis Shung, Anyanate Gwendolyne Jack, Susannah Cornes, Mackenzi Preston, Rene Kizilcec

TL;DR

MedSimAI addresses a core challenge in medical education: scalable, consistent training of clinical skills and communication with effective formative feedback. It introduces an AI-powered simulation platform featuring AI-standardized patients, multi-rubric assessment (including MIRS), and an SRL-oriented Learning Hub, designed to support deliberate practice across institutions. In a multi-site deployment (410 learners, 1,024 encounters), the system showed a meaningful OSCE improvement at one site ($p<0.001$, $d=0.75$) and strong automated scoring validity ($87.0\%$ thresholded accuracy), while engagement and SRL uptake depended on curricular integration. The work demonstrates that AI-assisted simulations can be scalable and informative for formative learning, but real-world impact hinges on curricular embedding, realistic interaction design, and explicit SRL scaffolds to guide learner reflection and progression.

Abstract

Medical education faces challenges in providing scalable, consistent clinical skills training. Simulation with standardized patients (SPs) develops communication and diagnostic skills but remains resource-intensive and variable in feedback quality. Existing AI-based tools show promise yet often lack comprehensive assessment frameworks, evidence of clinical impact, and integration of self-regulated learning (SRL) principles. Through a multi-phase co-design process with medical education experts, we developed MedSimAI, an AI-powered simulation platform that enables deliberate practice through interactive patient encounters with immediate, structured feedback. Leveraging large language models, MedSimAI generates realistic clinical interactions and provides automated assessments aligned with validated evaluation frameworks. In a multi-institutional deployment (410 students; 1,024 encounters across three medical schools), 59.5 percent engaged in repeated practice. At one site, mean Objective Structured Clinical Examination (OSCE) history-taking scores rose from 82.8 to 88.8 (p < 0.001, Cohen's d = 0.75), while a second site's pilot showed no significant change. Automated scoring achieved 87 percent accuracy in identifying proficiency thresholds on the Master Interview Rating Scale (MIRS). Mixed-effects analyses revealed institution and case effects. Thematic analysis of 840 learner reflections highlighted challenges in missed items, organization, review of systems, and empathy. These findings position MedSimAI as a scalable formative platform for history-taking and communication, motivating staged curriculum integration and realism enhancements for advanced learners.

Paper Structure

This paper contains 41 sections, 5 figures, 3 tables.

Figures (5)

  • Figure 1: The MedSimAI design and development timeline, occurring in three phases with continuous SME collaboration.
  • Figure 2: MedSimAI workflow: Students select cases, interact via chat or voice, and receive automated feedback based on multiple assessment frameworks. Not pictured are the SRL-enhancing components shown in Figure 3.
  • Figure 3: Learning Hub integrates five SRL components using clinical metaphors.
  • Figure 4: OSCE history-taking performance at Institution A. Bars show mean scores with standard deviations (Pre-platform $n=100$: $82.8\pm7.6$; Post-platform $n=100$: $88.8\pm8.5$). Difference $=6.0^{***}$, effect size $d=0.75$. $^{***}\!p<0.001$ (two-sided Welch $t$-test).
  • Figure 5: DOCS performance, Institution B (88 volunteers vs. 91 non-participants).
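
As a rough sanity check on the Figure 4 caption, the effect size and Welch statistic can be recomputed from the summary values it reports (means 82.8 and 88.8, standard deviations 7.6 and 8.5, $n=100$ per group). The sketch below is not the authors' analysis code. The function names are hypothetical, and the pooling convention for Cohen's $d$ (root mean of the two variances, appropriate for equal group sizes) is an assumption, since the caption does not state which variant was used.

```python
import math

def cohens_d(mean_pre, sd_pre, mean_post, sd_post):
    """Cohen's d for two equal-sized groups.

    Assumption: pooled SD is the root mean of the two variances;
    the source does not say which pooling convention was used.
    """
    pooled_sd = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2)
    return (mean_post - mean_pre) / pooled_sd

def welch_t(mean_pre, sd_pre, n_pre, mean_post, sd_post, n_post):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    var_pre = sd_pre ** 2 / n_pre      # squared standard error, pre group
    var_post = sd_post ** 2 / n_post   # squared standard error, post group
    t = (mean_post - mean_pre) / math.sqrt(var_pre + var_post)
    df = (var_pre + var_post) ** 2 / (
        var_pre ** 2 / (n_pre - 1) + var_post ** 2 / (n_post - 1)
    )
    return t, df

# Summary statistics as reported in the Figure 4 caption.
d = cohens_d(82.8, 7.6, 88.8, 8.5)
t, df = welch_t(82.8, 7.6, 100, 88.8, 8.5, 100)
print(f"d = {d:.2f}, t = {t:.2f}, df = {df:.1f}")
```

Under these assumptions $d$ comes out at roughly 0.74, close to the reported 0.75. The small gap may come from rounding in the published summary values or from a different pooling convention. The Welch statistic is a little above 5 with roughly 196 degrees of freedom, which is consistent with the reported $p<0.001$.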