Enhancing Clinical Note Generation with ICD-10, Clinical Ontology Knowledge Graphs, and Chain-of-Thought Prompting Using GPT-4

Ivan Makohon, Mohamad Najafi, Jian Wu, Mathias Brochhausen, Yaohang Li

TL;DR

The paper addresses the burden of clinical note generation by combining ICD-10 code prompts with Chain-of-Thought reasoning, semantic search, and SNOMED CT-based knowledge graphs to guide GPT-4 in producing HPI notes. It demonstrates that CoT prompting with semantically retrieved clinical case exemplars improves semantic alignment with ground truth compared to a baseline one-shot prompt, across six CodiEsp cases. Knowledge-graph augmentation increases output variability and does not consistently improve precision, highlighting the need for careful KG integration and human evaluation. The work contributes a reproducible prompting framework and a public code repository, with implications for reducing documentation burden and guiding safe AI-assisted clinical documentation.

Abstract

In the past decade, the amount of electronic health record (EHR) data in the United States has surged, a trend attributed to the favorable policy environment created by the Health Information Technology for Economic and Clinical Health (HITECH) Act of 2009 and the 21st Century Cures Act of 2016. Clinical notes on patients' assessments, diagnoses, and treatments are captured in these EHRs as free-form text by physicians, who spend a considerable amount of time entering and editing them. Writing these notes manually consumes a doctor's valuable time, increasing patients' waiting time and possibly delaying diagnoses. Large language models (LLMs) can generate news articles that closely resemble human-written ones. We investigate the use of Chain-of-Thought (CoT) prompt engineering to improve the LLM's responses in clinical note generation. Our prompts take International Classification of Diseases (ICD) codes and basic patient information as input. We investigate a strategy that combines traditional CoT with semantic search results to improve the quality of generated clinical notes. Additionally, we infuse a knowledge graph (KG) built from a clinical ontology to further enrich the domain-specific knowledge of the generated notes. We test our prompting technique on six clinical cases from the CodiEsp test dataset using GPT-4, and our results show that the resulting notes outperform those generated by standard one-shot prompts.
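
The combination of semantic search and CoT prompting described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the toy bag-of-words `embed` function stands in for a real sentence-embedding model, the two entries in `cases` are placeholders rather than CodiEsp data, and the prompt wording in `build_cot_prompt` is hypothetical. The resulting string would then be sent to an LLM such as GPT-4, a step omitted here.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve_exemplar(query, cases):
    """Return the stored case whose ICD description is most similar to the query."""
    q = embed(query)
    return max(cases, key=lambda c: cosine(q, embed(c["icd_description"])))

def build_cot_prompt(icd_code, icd_description, patient_info, exemplar):
    """Assemble a CoT-style prompt around the retrieved exemplar (hypothetical wording)."""
    return (
        f"Example case (ICD-10 {exemplar['icd_code']}): {exemplar['note']}\n\n"
        f"Patient: {patient_info}\n"
        f"ICD-10 code: {icd_code} ({icd_description})\n"
        "Think step by step about the likely symptoms, onset, and course, "
        "then write the History of Present Illness (HPI) note."
    )

# Placeholder exemplar store; the note texts are deliberately left as stubs.
cases = [
    {"icd_code": "J18.9", "icd_description": "Pneumonia, unspecified organism",
     "note": "(exemplar HPI note text)"},
    {"icd_code": "I10", "icd_description": "Essential (primary) hypertension",
     "note": "(exemplar HPI note text)"},
]

best = retrieve_exemplar("pneumonia, unspecified organism", cases)
prompt = build_cot_prompt("J18.9", "Pneumonia, unspecified organism",
                          "65-year-old male", best)
print(prompt)
```

In this sketch the retrieved exemplar plays the role of the demonstration in the prompt, so the model sees a clinically similar case before reasoning about the new one; a knowledge-graph variant would add ontology-derived facts to the same prompt.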

Paper Structure

This paper contains 18 sections, 1 equation, 18 figures, 3 tables.

Figures (18)

  • Figure 1: An illustration of the Semantic Search query used to search for similarities within the embedding space.
  • Figure 2: An example of the OWL expression prompt for CoT KG prompting.
  • Figure 3: Baseline (One-Shot) Prompt.
  • Figure 4: CoT Prompt (leveraging the ICD Code semantic search query).
  • Figure 5: CoT Prompt (leveraging the knowledge graph).
  • ...and 13 more figures