Guideline2Graph: Profile-Aware Multimodal Parsing for Executable Clinical Decision Graphs

Onur Selim Kilic, Yeti Z. Gurbuz, Cem O. Yaldiz, Afra Nawar, Etrit Haxholli, Ogul Can, Eli Waxman

Abstract

Clinical practice guidelines are long, multimodal documents whose branching recommendations are difficult to convert into executable clinical decision support (CDS), and one-shot parsing often breaks cross-page continuity. Recent LLM/VLM extractors are mostly local or text-centric, under-specifying section interfaces and failing to consolidate cross-page control flow across full documents into one coherent decision graph. We present a decomposition-first pipeline that converts full-guideline evidence into an executable clinical decision graph through topology-aware chunking, interface-constrained chunk graph generation, and provenance-preserving global aggregation. Rather than relying on single-pass generation, the pipeline uses explicit entry/terminal interfaces and semantic deduplication to preserve cross-page continuity while keeping the induced control flow auditable and structurally consistent. We evaluate on an adjudicated prostate-guideline benchmark with matched inputs and the same underlying VLM backbone across compared methods. On the complete merged graph, our approach improves edge and triplet precision/recall from $19.6\%/16.1\%$ in existing models to $69.0\%/87.5\%$, while node recall rises from $78.1\%$ to $93.8\%$. These results support decomposition-first, auditable guideline-to-CDS conversion on this benchmark, while current evidence remains limited to one adjudicated prostate guideline and motivates broader multi-guideline validation.

Paper Structure

This paper contains 17 sections, 3 figures, 2 tables, 3 algorithms.

Figures (3)

  • Figure 1: Overview of our profile-aware multimodal parsing framework. Unlike traditional practice and one-shot VLM summarization, our method uses topology-aware chunking, modular graph generation, and graph aggregation to preserve context and structure, yielding a scalable final graph for improved patient care.
  • Figure 2: Our detailed pipeline. Long CPGs are split into topology-aware chunks, each chunk graph is built via queue-based VLM expansion (with duplicate and ancestry updates), and all chunk graphs are iteratively merged into a final graph.
  • Figure 3: Qualitative comparison on one representative decision module. (A) AutoKG baseline output, (B) our output, and (C) adjudicated ground-truth graph. Our method better preserves path continuity and branching fidelity, with fewer spurious/fragmented transitions.
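
The aggregation step described for Figure 2 (iteratively merging chunk graphs while deduplicating nodes and keeping provenance) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Graph` class, `normalize`, `merge_chunk`, and the example clinical labels are all hypothetical, and simple label normalization stands in for the semantic deduplication the paper refers to.

```python
from dataclasses import dataclass, field


def normalize(label: str) -> str:
    """Assumed stand-in for semantic deduplication: case/whitespace-insensitive match."""
    return " ".join(label.lower().split())


@dataclass
class Graph:
    # node key -> ids of the chunks in which the node was seen (provenance)
    nodes: dict[str, set[str]] = field(default_factory=dict)
    # (source key, condition, target key) -> ids of the chunks asserting the edge
    edges: dict[tuple[str, str, str], set[str]] = field(default_factory=dict)


def merge_chunk(final: Graph, chunk_id: str,
                triplets: list[tuple[str, str, str]]) -> Graph:
    """Fold one chunk's (source, condition, target) triplets into the final graph."""
    for src, cond, dst in triplets:
        s, d = normalize(src), normalize(dst)
        # Duplicate nodes collapse onto one key; provenance accumulates.
        final.nodes.setdefault(s, set()).add(chunk_id)
        final.nodes.setdefault(d, set()).add(chunk_id)
        final.edges.setdefault((s, normalize(cond), d), set()).add(chunk_id)
    return final


# Hypothetical chunk graphs: the terminal node of chunk_1 is the entry node of chunk_2.
chunks = {
    "chunk_1": [("Initial workup", "PSA elevated", "Risk stratification")],
    "chunk_2": [("risk  stratification", "low risk", "Active surveillance")],
}

final = Graph()
for chunk_id, triplets in chunks.items():
    merge_chunk(final, chunk_id, triplets)

print(sorted(final.nodes))
print(final.nodes["risk stratification"])
```

In this toy case the shared node links the two chunks into a single path, and its provenance set records both source chunks, which is the property that keeps a merged graph auditable. A real system would need a stronger matching criterion than string normalization.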