MedPAO: A Protocol-Driven Agent for Structuring Medical Reports

Shrish Shrinath Vaidya, Gowthamaan Palani, Sidharth Ramesh, Velmurugan Balasubramanian, Minmini Selvam, Gokulraja Srinivasaraja, Ganapathy Krishnamurthi

TL;DR

MedPAO introduces a protocol-driven agent for structuring medical reports. It grounds its reasoning in established clinical workflows, notably the ABCDEF chest X-ray protocol, and orchestrates a Plan-Act-Observe loop over a modular set of tools. By combining a capable LLM with a model-context protocol and six specialized tools (concept extraction, ontology mapping, ontology filtering, concept categorization, report generation, and caching), the framework delivers protocol-compliant, verifiable outputs and reduces the hallucinations common in monolithic LLMs. Empirical results show an $F1$-score of $0.96$ on concept categorization and strong ratings from clinicians and radiologists (averages of $4.52$–$4.59$ out of 5), surpassing baseline LLMs and indicating improved reliability for structured radiology reporting. The approach is modality-agnostic and scalable, with potential extensions to additional imaging types and to real-time deployment through hardware optimization and image-integrated workflows.

Abstract

The deployment of Large Language Models (LLMs) for structuring clinical data is critically hindered by their tendency to hallucinate facts and their inability to follow domain-specific rules. To address this, we introduce MedPAO, a novel agentic framework that ensures accuracy and verifiable reasoning by grounding its operation in established clinical protocols such as the ABCDEF protocol for CXR analysis. MedPAO decomposes the report structuring task into a transparent process managed by a Plan-Act-Observe (PAO) loop and specialized tools. This protocol-driven method provides a verifiable alternative to opaque, monolithic models. The efficacy of our approach is demonstrated through rigorous evaluation: MedPAO achieves an F1-score of 0.96 on the critical sub-task of concept categorization. Notably, expert radiologists and clinicians rated the final structured outputs with an average score of 4.52 out of 5, indicating a level of reliability that surpasses baseline approaches relying solely on LLM-based foundation models. The code is available at: https://github.com/MiRL-IITM/medpao-agent
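
To make the Plan-Act-Observe loop concrete, the following is a minimal sketch of how such a loop might dispatch tools, not the authors' implementation (see the linked repository for that). Everything named here is hypothetical: the `plan` function is a rule-based stand-in for the LLM planner, the three tool functions are toy substitutes for the concept extraction, concept categorization, and report generation tools, and `TOY_LEXICON` is an illustrative keyword table whose letter assignments are assumptions rather than the protocol definition used in the paper.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

State = Dict[str, Any]

# Hypothetical keyword -> protocol letter table (assumed for illustration only).
TOY_LEXICON: Dict[str, str] = {
    "trachea": "A",
    "pneumothorax": "B",
    "cardiomegaly": "C",
}


def extract_concepts(state: State) -> State:
    """Toy concept extraction: keyword lookup in the free-text report."""
    text = state["report"].lower()
    return {"concepts": [term for term in TOY_LEXICON if term in text]}


def categorize_concepts(state: State) -> State:
    """Toy categorization: assign each concept its letter from the table."""
    return {"categories": {c: TOY_LEXICON[c] for c in state["concepts"]}}


def generate_report(state: State) -> State:
    """Group concepts by letter and render one line per category."""
    grouped: Dict[str, List[str]] = {}
    for concept, letter in state["categories"].items():
        grouped.setdefault(letter, []).append(concept)
    lines = [f"{k}: {', '.join(sorted(v))}" for k, v in sorted(grouped.items())]
    return {"structured": "\n".join(lines)}


TOOLS: Dict[str, Callable[[State], State]] = {
    "extract_concepts": extract_concepts,
    "categorize_concepts": categorize_concepts,
    "generate_report": generate_report,
}


def plan(state: State) -> Optional[str]:
    """Rule-based stand-in for the planner: pick the next missing step."""
    if "concepts" not in state:
        return "extract_concepts"
    if "categories" not in state:
        return "categorize_concepts"
    if "structured" not in state:
        return "generate_report"
    return None  # nothing left to do


def run_agent(report: str, max_steps: int = 10) -> Tuple[State, List[str]]:
    """Run the loop and return the final state plus the tool-call trace."""
    state: State = {"report": report}
    trace: List[str] = []
    for _ in range(max_steps):
        tool_name = plan(state)                 # Plan: choose the next tool
        if tool_name is None:
            break
        observation = TOOLS[tool_name](state)   # Act: call the chosen tool
        state.update(observation)               # Observe: fold result into state
        trace.append(tool_name)
    return state, trace


if __name__ == "__main__":
    final_state, trace = run_agent("Trachea is midline. Mild cardiomegaly.")
    print(trace)
    print(final_state["structured"])
```

The returned trace records which tool ran at each step, which is one simple way to expose the kind of step-by-step, inspectable process the abstract contrasts with a single opaque model call.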

Paper Structure

This paper contains 22 sections, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Illustrative example of MedPAO
  • Figure 2: Proposed agent vs. SOTA LLMs on the medical concept categorization task under the ABCDEF protocol
  • Figure 3: Architecture of the proposed MedPAO agent for medical report processing tasks
  • Figure 4: Step-by-step output of our agent when structuring a free-style report according to the protocol (Jones et al., 2025)
  • Figure 5: Confusion matrices for the concept categorization task, one per model