H-AdminSim: A Multi-Agent Simulator for Realistic Hospital Administrative Workflows with FHIR Integration

Jun-Min Lee, Meong Hi Son, Edward Choi

TL;DR

H-AdminSim addresses a gap in hospital administration research by offering an end-to-end simulation of outpatient administrative workflows. It combines synthetic data generation, multi-agent dialogue, and FHIR integration to enable systematic evaluation of LLM-based automation across hospital levels. Experiments show that tool-based scheduling is robust when the tools are available, whereas patient intake remains the primary bottleneck, which indicates where infrastructure effort should be focused. By standardizing the testbed for interoperability and for variation across primary, secondary, and tertiary settings, H-AdminSim facilitates benchmarking and feasibility studies for LLM-driven hospital administration.

Abstract

Hospital administration departments handle a wide range of operational tasks and, in large hospitals, process over 10,000 requests per day, driving growing interest in LLM-based automation. However, prior work has focused primarily on patient-physician interactions or isolated administrative subtasks, failing to capture the complexity of real administrative workflows. To address this gap, we propose H-AdminSim, a comprehensive end-to-end simulation framework that combines realistic data generation with multi-agent-based simulation of hospital administrative workflows. These tasks are quantitatively evaluated using detailed rubrics, enabling systematic comparison of LLMs. Through FHIR integration, H-AdminSim provides a unified and interoperable environment for testing administrative workflows across heterogeneous hospital settings, serving as a standardized testbed for assessing the feasibility and performance of LLM-driven administrative automation.

Paper Structure

This paper contains 47 sections, 22 figures, 14 tables, and 1 algorithm.

Figures (22)

  • Figure 1: Diagram of the hospital administration simulation. Synthesized hospital data populate the hospital information system (HIS) with physician and hospital information using FHIR. The framework then simulates patient intake and appointment scheduling, uploading the resulting patient and appointment records to the HIS while keeping physician schedules updated via real-time FHIR communication. The hospital simulation data shown in the upper-left are described in the appendix on data synthesis details.
  • Figure 2: Comparison of intake task success rates across models under varying prior-diagnosis settings. Performance improves with higher prior-diagnosis rates and longer patient-staff dialogues (hatched bars).
  • Figure 3: Department assignment errors under six conditions (three prior-diagnosis settings × two conversation-round settings). Errors decrease with more patients having prior diagnoses and with longer conversations.
  • Figure 4: Overview of the hierarchical structure of the data synthesis procedure. The left panel presents the configuration parameters that govern data generation, and the right panel illustrates the hierarchical synthesis of hospital, department, physician, and patient data. Physician schedules consist of busy and free time slots, where free slots represent periods available for outpatient appointments, and a subset of these free slots is allocated as appointment blocks.
  • Figure 5: System and user prompts used by the LLM for post-processing crawled data and extracting structured disease-symptom pairs. {DISEASE} and {WEBPAGE} denote placeholders for the disease name and its corresponding crawled webpage, respectively.
  • ...and 17 more figures
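
The captions of Figures 1 and 4 describe physician schedules as busy and free time slots, with booked slots uploaded to the HIS as FHIR records. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the 30-minute slot length, and the identifiers `p-001` and `dr-001` are assumptions made for illustration. The dictionary uses only basic fields of the FHIR R4 Appointment resource (`status`, `start`, `end`, `participant`).

```python
from datetime import datetime, timedelta, timezone


def free_slots(day_start, day_end, busy, slot_minutes=30):
    """Return fixed-length slots in [day_start, day_end) that overlap no busy interval.

    The 30-minute default is an illustrative assumption, not a value from the paper.
    """
    step = timedelta(minutes=slot_minutes)
    slots = []
    t = day_start
    while t + step <= day_end:
        # A slot is free if it ends before, or starts after, every busy interval.
        if all(t + step <= b_start or t >= b_end for b_start, b_end in busy):
            slots.append((t, t + step))
        t += step
    return slots


def to_fhir_appointment(slot, patient_id, practitioner_id):
    """Build a minimal FHIR R4 Appointment resource as a plain dict (hypothetical helper)."""
    start, end = slot
    return {
        "resourceType": "Appointment",
        "status": "booked",
        "start": start.isoformat(),
        "end": end.isoformat(),
        "participant": [
            {"actor": {"reference": f"Patient/{patient_id}"}, "status": "accepted"},
            {"actor": {"reference": f"Practitioner/{practitioner_id}"}, "status": "accepted"},
        ],
    }


# Example: the physician is busy from 09:00 to 10:00 within a 09:00-11:00 outpatient window.
day = datetime(2025, 1, 6, tzinfo=timezone.utc)
busy = [(day.replace(hour=9), day.replace(hour=10))]
slots = free_slots(day.replace(hour=9), day.replace(hour=11), busy)
appointment = to_fhir_appointment(slots[0], "p-001", "dr-001")
```

In a setup like the one Figure 1 depicts, a resource of this shape would be sent to the FHIR server of the HIS and the chosen slot removed from the physician's free list. The transport step is omitted here because the server endpoint is not specified in this summary.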