From Patient Consultations to Graphs: Leveraging LLMs for Patient Journey Knowledge Graph Construction
Hassan S. Al Khatib, Sudip Mittal, Shahram Rahimi, Nina Marhamati, Sean Bozorgzad
TL;DR
Fragmented patient data across care settings impedes coordinated care and outcome prediction. The authors propose a Patient Journey Knowledge Graph (PJKG) framework that uses LLMs to fuse structured intake data with unstructured patient-provider conversations, extracting and semantically aligning entities and relations to enable temporal and causal reasoning over patient journeys. They deliver an ontology-driven construction pipeline, a Neo4j-based implementation, and an evaluation across four LLMs. All four models achieve perfect schema compliance ($ICR=1.00$, $IPR=1.00$) but differ in semantic accuracy and efficiency: Claude 3.5 leads in semantic quality, while Llama 3.1 offers strong speed and scalability. The work highlights practical potential for patient-centric care and outlines future directions, including larger datasets, integration with Electronic Health Records (EHRs), and real-world deployments to support care coordination and outcome prediction.
Abstract
The transition towards patient-centric healthcare necessitates a comprehensive understanding of patient journeys, which encompass all healthcare experiences and interactions across the care spectrum. Existing healthcare data systems are often fragmented and lack a holistic representation of patient trajectories, creating challenges for coordinated care and personalized interventions. Patient Journey Knowledge Graphs (PJKGs) address this fragmentation by integrating diverse patient information into a unified, structured representation. This paper presents a methodology for constructing PJKGs using Large Language Models (LLMs) to process and structure both formal clinical documentation and unstructured patient-provider conversations. These graphs encapsulate temporal and causal relationships among clinical encounters, diagnoses, treatments, and outcomes, enabling advanced temporal reasoning and personalized care insights. The research evaluates four LLMs (Claude 3.5, Mistral, Llama 3.1, and ChatGPT-4o) on their ability to generate accurate and computationally efficient knowledge graphs. Results demonstrate that while all models achieved perfect structural compliance, they varied in medical entity processing and computational efficiency. The paper concludes by identifying key challenges and future research directions. This work contributes to advancing patient-centric healthcare through the development of comprehensive, actionable knowledge graphs that support improved care coordination and outcome prediction.
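To make the idea of a patient journey graph concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a toy schema: the node labels, relation names, identifiers, and dates are all hypothetical and are not taken from the paper's ontology. It shows two things the abstract alludes to: checking that extracted edges comply with a schema, and ordering encounters in time.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical schema (an assumption for illustration, not the paper's ontology):
# each entry is (source label, relation type, target label).
ALLOWED_RELATIONS = {
    ("Patient", "HAS_ENCOUNTER", "Encounter"),
    ("Encounter", "RESULTED_IN", "Diagnosis"),
    ("Diagnosis", "TREATED_WITH", "Treatment"),
    ("Treatment", "LED_TO", "Outcome"),
    ("Encounter", "FOLLOWED_BY", "Encounter"),
}


@dataclass(frozen=True)
class Node:
    id: str
    label: str
    when: Optional[date] = None  # only encounters carry a date in this sketch


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str
    target: str


def invalid_edges(nodes: List[Node], edges: List[Edge]) -> List[Edge]:
    """Return edges whose (source label, relation, target label) is not in the schema."""
    labels = {n.id: n.label for n in nodes}
    return [
        e for e in edges
        if (labels.get(e.source), e.relation, labels.get(e.target))
        not in ALLOWED_RELATIONS
    ]


def encounter_timeline(nodes: List[Node]) -> List[str]:
    """Return the ids of dated encounters in chronological order."""
    dated = [n for n in nodes if n.label == "Encounter" and n.when is not None]
    return [n.id for n in sorted(dated, key=lambda n: n.when)]


# Made-up example data standing in for LLM-extracted entities and relations.
NODES = [
    Node("p1", "Patient"),
    Node("e2", "Encounter", date(2024, 3, 1)),
    Node("e1", "Encounter", date(2024, 1, 15)),
    Node("d1", "Diagnosis"),
    Node("t1", "Treatment"),
    Node("o1", "Outcome"),
]
EDGES = [
    Edge("p1", "HAS_ENCOUNTER", "e1"),
    Edge("p1", "HAS_ENCOUNTER", "e2"),
    Edge("e1", "FOLLOWED_BY", "e2"),
    Edge("e1", "RESULTED_IN", "d1"),
    Edge("d1", "TREATED_WITH", "t1"),
    Edge("t1", "LED_TO", "o1"),
    Edge("o1", "TREATED_WITH", "p1"),  # deliberately violates the toy schema
]

if __name__ == "__main__":
    print("Timeline:", encounter_timeline(NODES))
    print("Schema violations:", invalid_edges(NODES, EDGES))
```

In this sketch, a graph with no entries returned by `invalid_edges` would be fully schema-compliant under the toy schema. That is only an analogy to the structural compliance the paper reports, not a definition of its ICR or IPR metrics.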
