CliCARE: Grounding Large Language Models in Clinical Guidelines for Decision Support over Longitudinal Cancer Electronic Health Records

Dongchen Li; Jitao Liang; Wei Li; Xiaoyu Wang; Longbing Cao; Kun Yu

TL;DR

CliCARE tackles long-range temporal reasoning, hallucination, and evaluation challenges in using LLMs for longitudinal cancer EHRs by transforming unstructured records into Temporal Knowledge Graphs and grounding them in a guideline knowledge graph. The framework combines EHR-to-TKG transformation with trajectory-guideline alignment, employing semantic matching, LLM reranking, and bootstrapped expansion to fuse patient trajectories with normative guidelines. An Expert-Validated LLM-as-a-Judge protocol provides reliable, scalable evaluation that correlates strongly with oncologists (Spearman's ρ ≈ 0.7). Empirical results on private CancerEHR and public MIMIC-Cancer datasets show that CliCARE substantially outperforms standard RAG and KG-enhanced baselines, and indicate that structured knowledge and long-context processing are key to effective decision support in oncology.
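To make the agreement figure above concrete, the following is a minimal sketch of how a Spearman rank correlation between LLM-judge scores and oncologist scores could be computed. The function names and the toy scores are assumptions for illustration only; they are not taken from the paper, and the paper's actual evaluation pipeline may differ.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for five generated reports (invented toy data).
judge_scores = [4, 3, 5, 2, 1]
oncologist_scores = [5, 3, 4, 2, 1]

rho = spearman_rho(judge_scores, oncologist_scores)
print(f"Spearman rho = {rho:.2f}")
```

A value near 1 means the judge orders reports almost exactly as the oncologists do, and a value near 0 means the two orderings are unrelated.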

Abstract

Large Language Models (LLMs) hold significant promise for improving clinical decision support and reducing physician burnout by synthesizing complex, longitudinal cancer Electronic Health Records (EHRs). However, their implementation in this critical field faces three primary challenges: the inability to effectively process the extensive length and fragmented nature of patient records for accurate temporal analysis; a heightened risk of clinical hallucination, as conventional grounding techniques such as Retrieval-Augmented Generation (RAG) do not adequately incorporate process-oriented clinical guidelines; and unreliable evaluation metrics that hinder the validation of AI systems in oncology. To address these issues, we propose CliCARE, a framework for Grounding Large Language Models in Clinical Guidelines for Decision Support over Longitudinal Cancer Electronic Health Records. The framework operates by transforming unstructured, longitudinal EHRs into patient-specific Temporal Knowledge Graphs (TKGs) to capture long-range dependencies, and then grounding the decision support process by aligning these real-world patient trajectories with a normative guideline knowledge graph. This approach provides oncologists with evidence-grounded decision support by generating a high-fidelity clinical summary and an actionable recommendation. We validated our framework using large-scale, longitudinal data from a private Chinese cancer dataset and the public English MIMIC-IV dataset. In these settings, CliCARE significantly outperforms baselines, including leading long-context LLMs and Knowledge Graph-enhanced RAG methods. The clinical validity of our results is supported by a robust evaluation protocol, which demonstrates a high correlation with assessments made by oncologists.
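The two core ideas in the abstract, representing a patient record as time-stamped graph edges and aligning the resulting trajectory with guideline nodes, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `TemporalFact` type, the `align` function, the threshold, and the toy data are hypothetical, and simple token overlap stands in for the semantic matching and LLM reranking that the framework describes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TemporalFact:
    """One time-stamped edge of a patient TKG: (subject, relation, object, when)."""
    subject: str
    relation: str
    obj: str
    when: date


def trajectory(facts):
    """Order a patient's facts chronologically to form a trajectory."""
    return sorted(facts, key=lambda f: f.when)


def jaccard(a, b):
    """Token-overlap similarity; a crude stand-in for semantic matching."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def align(facts, guideline_nodes, threshold=0.3):
    """Map each event to its best-matching guideline node, or None if below threshold."""
    aligned = []
    for fact in trajectory(facts):
        event = f"{fact.relation} {fact.obj}"
        best = max(guideline_nodes, key=lambda node: jaccard(event, node))
        score = jaccard(event, best)
        aligned.append((fact, best if score >= threshold else None))
    return aligned


# Toy data, invented for illustration only.
facts = [
    TemporalFact("patient_1", "received", "adjuvant chemotherapy", date(2021, 6, 1)),
    TemporalFact("patient_1", "diagnosed with", "breast cancer", date(2021, 3, 1)),
]
guideline_nodes = [
    "diagnosed with breast cancer",
    "received adjuvant chemotherapy",
    "follow-up imaging",
]

for fact, node in align(facts, guideline_nodes):
    print(fact.when.isoformat(), "->", node)
```

Sorting by timestamp before matching preserves the order of events, which is what allows a patient trajectory to be compared against a process-oriented guideline rather than against isolated facts.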
