Retrieval-Augmented and Knowledge-Grounded Language Models for Faithful Clinical Medicine

Fenglin Liu, Bang Yang, Chenyu You, Xian Wu, Shen Ge, Zhangdaihong Liu, Xu Sun, Yang Yang, David A. Clifton

TL;DR

Re$^3$Writer combines retrieval-augmented generation with knowledge-grounded reasoning so that LMs generate faithful clinical texts; its effectiveness is demonstrated on the task of generating patient discharge instructions.

Abstract

Language models (LMs), including large language models (such as ChatGPT), have the potential to assist clinicians in generating various clinical notes. However, LMs are prone to producing ``hallucinations'', i.e., generated content that is not aligned with facts and knowledge. In this paper, we propose the Re$^3$Writer method with retrieval-augmented generation and knowledge-grounded reasoning to enable LMs to generate faithful clinical texts. We demonstrate the effectiveness of our method in generating patient discharge instructions. This task requires the LMs not only to understand the patients' long clinical documents, i.e., the health records during hospitalization, but also to generate critical instructional information provided both to carers and to the patient at the time of discharge. The proposed Re$^3$Writer imitates the working patterns of physicians: it first \textbf{re}trieves related working experience from historical instructions written by physicians, then \textbf{re}asons related medical knowledge. Finally, it \textbf{re}fines the retrieved working experience and reasoned medical knowledge to extract useful information, which is used to generate the discharge instructions for previously unseen patients. Our experiments show that, using our method, the performance of five representative LMs can be substantially boosted across all metrics. We also report human evaluations that measure effectiveness in terms of fluency, faithfulness, and comprehensiveness.
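The retrieve, reason, and refine steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words similarity, the dictionary-based graph, the thresholded filter, and every name used here (`bow`, `cosine`, `retrieve`, `reason`, `refine`, and the placeholder codes `DX_1`, `MED_1`, `PROC_1`) are assumptions chosen only to make the control flow concrete.

```python
import math
from collections import Counter


def bow(text):
    """Lower-cased bag-of-words vector (toy stand-in for a learned encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(record, history, k=2):
    """Retrieve: return the instructions of the k past records most similar
    to the current record. `history` is a list of (past_record, instruction)."""
    query = bow(record)
    ranked = sorted(history, key=lambda pair: cosine(query, bow(pair[0])), reverse=True)
    return [instruction for _, instruction in ranked[:k]]


def reason(codes, graph, k=3):
    """Reason: collect the k graph neighbours most strongly connected to the
    patient's clinical codes. `graph` maps code -> {neighbour: edge weight}."""
    scores = Counter()
    for code in codes:
        for neighbour, weight in graph.get(code, {}).items():
            if neighbour not in codes:
                scores[neighbour] += weight
    return [neighbour for neighbour, _ in scores.most_common(k)]


def refine(record, candidates, threshold=0.2):
    """Refine: keep only candidates relevant to the current record.
    (Assumption: a similarity threshold stands in for a learned filter.)"""
    query = bow(record)
    return [c for c in candidates if cosine(query, bow(c)) >= threshold]


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper.
    history = [
        ("chest pain stent placed", "care for the stent site after chest pain"),
        ("knee replacement physical therapy", "knee exercises twice daily"),
    ]
    graph = {"DX_1": {"MED_1": 0.8, "PROC_1": 0.6}}
    record = "patient admitted with chest pain stent placed"
    experience = retrieve(record, history, k=2)
    knowledge = reason(["DX_1"], graph, k=2)
    print(refine(record, experience), knowledge)
```

In a full system, the refined experience and knowledge would condition the LM that writes the discharge instruction; the sketch stops before generation because that step depends on the chosen model.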

Paper Structure (16 sections, 7 equations, 4 figures, 9 tables)

Figures (4)

  • Figure 1: Two examples of the Patient Instruction (PI) written by physicians, which guides patients on how to manage their conditions after discharge based on their health records during hospitalization.
  • Figure 2: Re$^3$Writer (Retrieve, Reason, and Refine), illustrated with the Transformer (Vaswani et al., 2017) as the example baseline. It first retrieves related working experience from historical PIs written by physicians and reasons related medical knowledge from a medical knowledge graph; it then adaptively refines and merges them to generate an accurate and faithful patient instruction for the current, previously unseen patient.
  • Figure 3: An example of the PI generated by the baseline and by our approach (i.e., the baseline with Re$^3$Writer). Underlined text denotes alignment between the ground-truth text and the generated text. Red text denotes unfavorable results. Blue and green text denote, respectively, the retrieved working experience and the reasoned medical knowledge used when generating the corresponding sentences.
  • Figure 4: The constructed medical knowledge graph. Each clinical code corresponds to a node in the graph. We present the 6 most frequent diagnosis nodes (first row), 5 medication nodes (second row), and 6 procedure nodes (third row), along with some of their edge weights. Please refer to Table \ref{tab:graph} for the exact meanings of these diagnosis and procedure nodes.
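
The graph structure described in the Figure 4 caption (one node per clinical code, weighted edges between nodes) can be sketched as follows. The caption does not say how edge weights are computed, so this sketch assumes, purely for illustration, that a weight is the number of admissions in which two codes co-occur; the function names `build_graph` and `top_neighbours` and the codes `D1`, `M1`, `P1` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(admissions):
    """Build an undirected weighted graph from per-admission code sets.
    Assumption: edge weight = number of admissions where both codes appear."""
    weights = defaultdict(lambda: defaultdict(int))
    for codes in admissions:
        for a, b in combinations(sorted(set(codes)), 2):
            weights[a][b] += 1
            weights[b][a] += 1
    # Convert nested defaultdicts to plain dicts for safer read access.
    return {code: dict(neighbours) for code, neighbours in weights.items()}


def top_neighbours(graph, code, k=3):
    """Return the k neighbours of `code` with the largest edge weights,
    breaking ties alphabetically so the result is deterministic."""
    neighbours = graph.get(code, {})
    return sorted(neighbours, key=lambda n: (-neighbours[n], n))[:k]


if __name__ == "__main__":
    # Hypothetical admissions: D* = diagnosis, M* = medication, P* = procedure.
    admissions = [{"D1", "M1", "P1"}, {"D1", "M1"}, {"D2", "M1"}]
    graph = build_graph(admissions)
    print(top_neighbours(graph, "D1", k=2))
```

Raw counts are the simplest choice; a normalized weight (for example, dividing by the frequency of each code) would reduce the dominance of very common codes, at the cost of an extra pass over the data.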