medIKAL: Integrating Knowledge Graphs as Assistants of LLMs for Enhanced Clinical Diagnosis on EMRs

Mingyi Jia; Junwen Duan; Yan Song; Jianxin Wang

TL;DR

medIKAL tackles the challenge of EMR-based clinical diagnosis by marrying LLM capabilities with a weighted knowledge-graph search. It introduces weighted entity-type scores $w_{t}$ to localize candidate diseases in the KG, and uses a residual-like integration to blend the LLM’s initial diagnosis with KG-derived candidates, followed by a path-based reranking using the shortest-path distance $\text{dist}(\text{D}_i, e_j)$. A KG knowledge reconstruction step yields a semi-structured input for the LLM via fill-in-the-blank prompts, guided by a threshold $\theta$ (set to 60% of the total score) to decide final diagnoses. Experiments on the open Chinese EMR dataset CMEMR and supplementary EMR datasets show that medIKAL outperforms strong baselines and demonstrates robustness across backbones and data conditions, highlighting its potential for practical AI-assisted clinical diagnosis while noting limitations related to data sparsity and numerical indicator handling.
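
The candidate pipeline summarized above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy graph `KG`, the weight values in `TYPE_WEIGHTS`, the top-`k` cutoff, and the use of a summed shortest-path distance as the reranking key are all assumptions made for illustration, and the threshold $\theta$ step is left out because its exact role is not specified here.

```python
from collections import deque

# Hypothetical toy KG as an undirected adjacency map (not from the paper).
KG = {
    "pneumonia": {"cough", "fever", "chest X-ray"},
    "bronchitis": {"cough"},
    "influenza": {"fever", "oseltamivir"},
    "cough": {"pneumonia", "bronchitis"},
    "fever": {"pneumonia", "influenza"},
    "chest X-ray": {"pneumonia"},
    "oseltamivir": {"influenza"},
}
DISEASES = {"pneumonia", "bronchitis", "influenza"}
# Assumed entity-type weights w_t; the values are illustrative only.
TYPE_WEIGHTS = {"symptom": 1.0, "examination": 2.0, "medication": 1.5}


def localize(entities):
    """Score each disease by summing w_t over the EMR entities adjacent to it."""
    scores = {}
    for name, etype in entities:
        for neighbour in KG.get(name, ()):
            if neighbour in DISEASES:
                scores[neighbour] = scores.get(neighbour, 0.0) + TYPE_WEIGHTS[etype]
    return scores


def dist(src, dst):
    """Shortest-path length between two KG nodes via BFS; inf if unreachable."""
    if src not in KG or dst not in KG:
        return float("inf")
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, depth = queue.popleft()
        if node == dst:
            return depth
        for neighbour in KG[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return float("inf")


def rank_candidates(entities, llm_diagnoses, k=2):
    """Merge top-k KG candidates with the LLM's own diagnoses, then rerank."""
    scores = localize(entities)
    top_k = sorted(scores, key=lambda d: (-scores[d], d))[:k]
    # Residual-like merge: the LLM's initial diagnoses are kept as candidates.
    merged = list(dict.fromkeys(top_k + list(llm_diagnoses)))

    def total_distance(disease):
        # Assumed reranking key: sum of dist(D_i, e_j) over all EMR entities.
        return sum(dist(disease, name) for name, _ in entities)

    return sorted(merged, key=lambda d: (total_distance(d), d))


emr_entities = [("cough", "symptom"), ("fever", "symptom"), ("chest X-ray", "examination")]
ranking = rank_candidates(emr_entities, llm_diagnoses=["influenza"])
```

In this sketch the LLM's diagnosis ("influenza") stays in the candidate list even though it falls outside the KG top-`k`, which is the point of the residual-like merge; candidates closer to the recorded entities in the graph are ranked first.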

Abstract

Electronic Medical Records (EMRs), while integral to modern healthcare, present challenges for clinical reasoning and diagnosis due to their complexity and information redundancy. To address this, we propose medIKAL (Integrating Knowledge Graphs as Assistants of LLMs), a framework that combines Large Language Models (LLMs) with knowledge graphs (KGs) to enhance diagnostic capabilities. medIKAL assigns weighted importance to entities in medical records based on their type, enabling precise localization of candidate diseases within KGs. It employs a residual-network-like approach that allows the LLM's initial diagnosis to be merged into the KG search results. Through a path-based reranking algorithm and a fill-in-the-blank style prompt template, it further refines the diagnostic process. We validate medIKAL's effectiveness through extensive experiments on a newly introduced open-source Chinese EMR dataset, demonstrating its potential to improve clinical diagnosis in real-world settings.

Paper Structure (38 sections, 5 equations, 7 figures, 14 tables, 2 algorithms)

Introduction
Related Work
Clinical Diagnosis and Prediction on EMRs
Knowledge Graphs Augmented LLMs
Method
EMR Summarisation and Direct Diagnosis via LLMs
Candidate Disease Localization and Reranking via KG
Entity Recognition and Matching
Candidate Disease Localization Based on Entity-Type Weights
Candidate Disease Reranking Based on Paths
Collaborative Reasoning between LLMs and KG Knowledge
Reconstruction of KG Knowledge
Clinical Reasoning and Diagnosis Based on Fill-in-the-Blank Prompt Templates
Experiments
Experimental Setup
...and 23 more sections

Figures (7)

Figure 1: Limitations of existing methods using KG-augmented LLMs for application to EMR diagnostic tasks. ① Use subgraphs/triplets to augment context. ② Use reasoning chains to augment context. ③ Use an iteration-based approach to involve LLMs in KG searching and reasoning.
Figure 2: The overall workflow of medIKAL. It contains three main modules: Module 1. Preprocessing before KG search (A, B, and C.1); Module 2. Candidate Disease Localization and Reranking via KG (C.2 and D); Module 3. Collaborative Reasoning for LLMs and KG (E).
Figure 3: An illustration of how the reranking process is combined with the knowledge construction process.
Figure 4: Evaluation results for the ability of medIKAL and other baseline methods to utilize the LLM's internal knowledge. "Retained" denotes that the useful diagnoses from the LLM's original predictions are kept as final results, and "Lost" denotes the opposite.
Figure 5: A data example from CMEMR.
...and 2 more figures
