From Patient Consultations to Graphs: Leveraging LLMs for Patient Journey Knowledge Graph Construction

Hassan S. Al Khatib, Sudip Mittal, Shahram Rahimi, Nina Marhamati, Sean Bozorgzad

TL;DR

Fragmented patient data across care settings impedes coordinated care and outcome prediction. The authors propose a PJKG framework that fuses structured intake data with unstructured patient-provider conversations using LLMs to extract and semantically align entities and relations, enabling temporal and causal reasoning over patient journeys. They deliver an ontology-driven construction pipeline, a Neo4j-based implementation, and an evaluation across four LLMs, demonstrating perfect schema compliance ($ICR=1.00$, $IPR=1.00$) but varying semantic accuracy and efficiency, with Anthropic's Claude 3.5 leading in semantic quality and Llama 3.1 offering strong speed and scalability. The work highlights practical potential for patient-centric care and outlines future directions, including larger datasets, integration with Electronic Health Records (EHRs), and real-world deployments to support care coordination and outcome prediction.

Abstract

The transition towards patient-centric healthcare necessitates a comprehensive understanding of patient journeys, which encompass all healthcare experiences and interactions across the care spectrum. Existing healthcare data systems are often fragmented and lack a holistic representation of patient trajectories, creating challenges for coordinated care and personalized interventions. Patient Journey Knowledge Graphs (PJKGs) address this fragmentation by integrating diverse patient information into a unified, structured representation. This paper presents a methodology for constructing PJKGs using Large Language Models (LLMs) to process and structure both formal clinical documentation and unstructured patient-provider conversations. These graphs encapsulate temporal and causal relationships among clinical encounters, diagnoses, treatments, and outcomes, enabling advanced temporal reasoning and personalized care insights. The research evaluates four LLMs (Claude 3.5, Mistral, Llama 3.1, and ChatGPT-4o) on their ability to generate accurate and computationally efficient knowledge graphs. Results demonstrate that while all models achieved perfect structural compliance, they exhibited variations in medical entity processing and computational efficiency. The paper concludes by identifying key challenges and future research directions. This work contributes to advancing patient-centric healthcare through the development of comprehensive, actionable knowledge graphs that support improved care coordination and outcome prediction.

Paper Structure

This paper contains 18 sections, 13 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: A visual representation of the ontology structure for the PJKG, showcasing key entities (e.g., Patient, Encounter, Diagnosis, Symptoms) and their relationships, capturing the comprehensive flow of patient care data.
  • Figure 2: A visual depiction of the process to create a PJKG using ontology-based structured prompts and transcribed patient-provider conversations, processed by an LLM that performs named entity recognition (NER) and relation extraction (RE) and emits the results in JSON format, which are then loaded into a graph database for visualization and analysis.
  • Figure 3: Visualizing the dynamic evolution of a PJKG across five clinical encounters, highlighting the integration of new diagnostic, treatment, and care plan information at each stage of the patient’s journey.
  • Figure 4: Comparison of node counts across PJKGs generated by different LLMs. The chart illustrates the distribution of node types, such as Symptom, Diagnosis, and DiagnosticTest, reflecting differences in node completeness across LLM-generated graphs.
  • Figure 5: Comparison of the relationship counts across PJKGs built using different LLMs. The chart highlights the distribution of various relationship types, such as HAS_DIAGNOSIS, HAS_SYMPTOM, and HAS_TEST, indicating variations in the completeness of relationship representation.
  • ...and 1 more figure
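
The pipeline described in the Figure 2 caption (LLM output in JSON, checked against the ontology, then loaded into a graph database) can be illustrated with a minimal sketch. This is not the authors' implementation. The node labels and relationship types come from the figure captions, but the JSON layout, the source/target pairings in `ALLOWED_RELS`, the function names, and the sample consultation data are all assumptions made for illustration. The sketch only builds Cypher statements as strings and does not connect to Neo4j.

```python
import json

# Labels and relationship types appear in the figure captions.
# The (source, target) pairings below are assumptions, not the paper's ontology.
ALLOWED_NODES = {"Patient", "Encounter", "Diagnosis", "Symptom", "DiagnosticTest"}
ALLOWED_RELS = {
    "HAS_DIAGNOSIS": ("Encounter", "Diagnosis"),
    "HAS_SYMPTOM": ("Encounter", "Symptom"),
    "HAS_TEST": ("Encounter", "DiagnosticTest"),
}


def validate(extraction):
    """Return a list of schema violations (empty if the extraction complies)."""
    errors = []
    labels = {}
    for node in extraction["nodes"]:
        if node["label"] not in ALLOWED_NODES:
            errors.append(f"unknown node label: {node['label']}")
        labels[node["id"]] = node["label"]
    for rel in extraction["relationships"]:
        expected = ALLOWED_RELS.get(rel["type"])
        if expected is None:
            errors.append(f"unknown relationship type: {rel['type']}")
            continue
        actual = (labels.get(rel["source"]), labels.get(rel["target"]))
        if actual != expected:
            errors.append(f"{rel['type']} connects {actual}, expected {expected}")
    return errors


def to_cypher(extraction):
    """Build (query, params) pairs; refuses input that fails validation.

    Labels and relationship types are interpolated into the query text,
    which is only safe because they were checked against the allow-lists.
    """
    errors = validate(extraction)
    if errors:
        raise ValueError("; ".join(errors))
    statements = []
    for node in extraction["nodes"]:
        query = f"MERGE (n:{node['label']} {{id: $id}}) SET n.name = $name"
        statements.append((query, {"id": node["id"], "name": node["name"]}))
    for rel in extraction["relationships"]:
        query = (
            "MATCH (a {id: $source}), (b {id: $target}) "
            f"MERGE (a)-[:{rel['type']}]->(b)"
        )
        statements.append((query, {"source": rel["source"], "target": rel["target"]}))
    return statements


# Hypothetical LLM output for one encounter (invented sample data).
llm_output = """
{"nodes": [
  {"id": "e1", "label": "Encounter", "name": "Initial consultation"},
  {"id": "s1", "label": "Symptom", "name": "chest pain"},
  {"id": "d1", "label": "Diagnosis", "name": "angina"}],
 "relationships": [
  {"source": "e1", "type": "HAS_SYMPTOM", "target": "s1"},
  {"source": "e1", "type": "HAS_DIAGNOSIS", "target": "d1"}]}
"""
extraction = json.loads(llm_output)
statements = to_cypher(extraction)
for query, params in statements:
    print(query, params)
```

Each `(query, params)` pair could then be passed to a graph database session. Validating before generating queries keeps non-ontology labels out of the graph, which is one plausible way to enforce the kind of structural compliance the evaluation reports.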