EMERGE: Enhancing Multimodal Electronic Health Records Predictive Modeling with Retrieval-Augmented Generation

Yinghao Zhu, Changyu Ren, Zixiang Wang, Xiaochen Zheng, Shiyun Xie, Junlan Feng, Xi Zhu, Zhoujun Li, Liantao Ma, Chengwei Pan

TL;DR

EMERGE presents a Retrieval-Augmented Generation framework that enhances multimodal EHR predictive modeling by integrating time-series data, clinical notes, and a professional knowledge graph (PrimeKG). By prompting LLMs to extract entities, aligning them with the KG to mitigate hallucinations, and distilling this information into task-focused summaries, EMERGE produces a rich, knowledge-grounded representation that is fused with other modalities via a cross-attention network. The approach achieves state-of-the-art performance on the MIMIC-III and MIMIC-IV benchmarks for in-hospital mortality and 30-day readmission, with extensive ablations confirming the contribution of each component and robustness to data sparsity. The work offers a practical, scalable pathway for leveraging external medical knowledge and LLM reasoning in clinical prediction, accompanied by publicly available code.

Abstract

The integration of multimodal Electronic Health Records (EHR) data has significantly advanced clinical predictive capabilities. Existing models, which utilize clinical notes and multivariate time-series EHR data, often fall short of incorporating the necessary medical context for accurate clinical tasks, while previous approaches with knowledge graphs (KGs) primarily focus on structured knowledge extraction. In response, we propose EMERGE, a Retrieval-Augmented Generation (RAG) driven framework to enhance multimodal EHR predictive modeling. We extract entities from both time-series data and clinical notes by prompting Large Language Models (LLMs) and align them with the professional knowledge graph PrimeKG to ensure consistency. In addition to triplet relationships, we incorporate entities' definitions and descriptions for richer semantics. The extracted knowledge is then used to generate task-relevant summaries of patients' health statuses. Finally, we fuse the summary with other modalities using an adaptive multimodal fusion network with cross-attention. Extensive experiments on the in-hospital mortality and 30-day readmission tasks of the MIMIC-III and MIMIC-IV datasets demonstrate the superior performance of the EMERGE framework over baseline models. Comprehensive ablation studies and analyses highlight the efficacy of each designed module and the framework's robustness to data sparsity. EMERGE contributes to refining the utilization of multimodal EHR data in healthcare, bridging the gap with nuanced medical contexts essential for informed clinical predictions. We have publicly released the code at https://github.com/yhzhu99/EMERGE.
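
To make the cross-attention fusion step more concrete, the following is a minimal sketch of scaled dot-product cross-attention between two modalities, written in plain Python. It is not the authors' fusion network: the function names (`softmax`, `cross_attention`), the toy embeddings, and the omission of learned projection matrices and any adaptive gating are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query vector attends over all keys.

    Illustrative only: a real fusion layer would first apply learned
    projections to queries, keys, and values, which are omitted here.
    """
    d = len(keys[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        fused.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return fused


# Hypothetical toy embeddings (2-dimensional) for two modalities.
ts_tokens = [[1.0, 0.0], [0.0, 1.0]]                 # time-series representation
text_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # summary/notes representation

# Each modality queries the other, giving one fused view per direction.
fused_ts = cross_attention(ts_tokens, text_tokens, text_tokens)
fused_text = cross_attention(text_tokens, ts_tokens, ts_tokens)
```

Each output row is a convex combination of the other modality's vectors, so a time-series token is enriched mostly by the text tokens it resembles, and vice versa.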

Paper Structure

This paper contains 37 sections, 14 equations, 11 figures, and 3 tables.

Figures (11)

  • Figure 1: Overall architecture of our proposed EMERGE framework. The modules enclosed within the dashed box illustrate the RAG-driven enhancement pipeline. "LM" denotes a language model (essentially a BERT-based model), while "LLM" in this paper generally refers to a GPT-based large language model.
  • Figure 2: Process of information retrieval for time-series data.
  • Figure 3: Prompt template for extracting entities.
  • Figure 4: Process of information retrieval for textual clinical notes. A grey block among the potential diseases indicates that no corresponding node was found in the external KG.
  • Figure 5: Prompt template for summary generation.
  • ...and 6 more figures
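
The alignment step referred to in the captions for Figures 2 to 4, where an extracted entity with no corresponding KG node is discarded, can be sketched as follows. This is an illustrative assumption rather than the paper's method: the excerpt does not say how matching is performed, so character-level similarity from Python's standard `difflib` stands in for it, and the function name `align_entities`, the 0.8 threshold, and the toy node list are all hypothetical.

```python
from difflib import SequenceMatcher


def align_entities(extracted, kg_nodes, threshold=0.8):
    """Map each LLM-extracted entity to its closest KG node, or None.

    String similarity is a stand-in (assumption) for whatever matching the
    actual pipeline uses; entities below the threshold are treated as
    having no corresponding node in the KG.
    """
    aligned = {}
    for entity in extracted:
        best_node, best_score = None, 0.0
        for node in kg_nodes:
            score = SequenceMatcher(None, entity.lower(), node.lower()).ratio()
            if score > best_score:
                best_node, best_score = node, score
        aligned[entity] = best_node if best_score >= threshold else None
    return aligned


# Hypothetical KG node names and LLM outputs; "zorblax fever" is a made-up
# entity playing the role of a hallucination.
kg_nodes = ["sepsis", "pneumonia", "acute kidney injury"]
extracted = ["Sepsis", "pneumonias", "zorblax fever"]

result = align_entities(extracted, kg_nodes)
kept = [node for node in result.values() if node is not None]
```

Only the entities that map to a KG node (`kept`) would be passed on to retrieve triplets, definitions, and descriptions; the unmatched one is dropped, which is how grounding in the KG can limit the effect of hallucinated entities.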