README: Bridging Medical Jargon and Lay Understanding for Patient Education through Data-Centric NLP

Zonghai Yao; Nandyala Siddharth Kantu; Guanghao Wei; Hieu Tran; Zhangqi Duan; Sunjae Kwon; Zhichao Yang; README annotation team; Hong Yu

TL;DR

This research introduces a new task of automatically generating lay definitions, aiming to simplify complex medical terms into patient-friendly lay language, and demonstrates that open-source mobile-friendly models, when fine-tuned with high-quality data, are capable of matching or even surpassing the performance of state-of-the-art closed-source large language models like ChatGPT.

Abstract

The advancement in healthcare has shifted focus toward patient-centric approaches, particularly in self-care and patient education, facilitated by access to Electronic Health Records (EHR). However, medical jargon in EHRs poses significant challenges in patient comprehension. To address this, we introduce a new task of automatically generating lay definitions, aiming to simplify complex medical terms into patient-friendly lay language. We first created the README dataset, an extensive collection of over 50,000 unique (medical term, lay definition) pairs and 300,000 mentions, each offering context-aware lay definitions manually annotated by domain experts. We have also engineered a data-centric Human-AI pipeline that synergizes data filtering, augmentation, and selection to improve data quality. We then used README as the training data for models and leveraged a Retrieval-Augmented Generation method to reduce hallucinations and improve the quality of model outputs. Our extensive automatic and human evaluations demonstrate that open-source mobile-friendly models, when fine-tuned with high-quality data, are capable of matching or even surpassing the performance of state-of-the-art closed-source large language models like ChatGPT. This research represents a significant stride in closing the knowledge gap in patient education and advancing patient-centric healthcare solutions.

Paper Structure (39 sections, 2 equations, 7 figures, 8 tables, 1 algorithm)

Introduction
Problem Statement
Method
README Data Collection
Lay Definition Annotation
General Definition Retrieval
Examiner-Augmenter-Examiner (EAE)
Examiner (expert-annotated data)
Augmenter
Examiner (AI-synthetic data)
Quality Checking
Integration of Synthetic and Expert Data
Experiments
Automatic Evaluation Metrics
Experimental Setting
...and 24 more sections

Figures (7)

Figure 1: A visualization of the NoteAid pipeline, where NLP tools first identify jargon that may be challenging for patients to understand. The lay definitions corresponding to these jargon terms are then retrieved from relevant dictionaries and presented to the patients, enhancing their comprehension and engagement with their health information.
Figure 2: Our Data-Centric NLP pipeline, comprising the Examiner-Augmenter-Examiner (EAE) framework and different data selection methods. EAE shows how humans (physicians) and AI (an LLM, e.g. ChatGPT) cooperate to build a high-quality README dataset. We collect general definitions for every jargon term from external knowledge resources such as UMLS. "R" is "README", "exp" is "expert annotation version", and "syn" is "AI synthetic version". "instruction" and "demo" (examples for in-context learning) are combined into the prompt for the LLM. In the pipeline, the human duties at different stages are annotator (labeling the initial dataset) and instructor (providing suitable prompts to guide the AI at every stage). The AI duties at different stages are examiner (filtering for high-quality data) and augmenter (improving the quality of low-quality data). The dataset statistics table in the appendix gives the counts for the different versions of the README dataset at each step. After we get two high-quality datasets, R-exp_good and R-syn_good, from the EAE pipeline, we deploy 4 different data selection strategies to combine the high-quality expert-annotated data R-exp_good and the high-quality AI-synthetic data R-syn_good for in-house system training.
Figure 3: One-shot performances on jargon2lay.
Figure 4: Comparative performance analysis of DistilGPT2, BioGPT, and LLAMA2 against GPT-3.5-turbo.
Figure 5: Human evaluation results (win rate %).
...and 2 more figures
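
The Examiner-Augmenter-Examiner flow described in the Figure 2 caption can be illustrated with a minimal sketch. It is not the authors' implementation: the `Pair` type, the `split` and `eae` functions, and the rule-based `toy_examiner` and `toy_augmenter` are hypothetical stand-ins for the prompted LLM calls, and the three example pairs are invented for illustration. Only the overall shape (examine expert data, augment the rejected part, examine the synthetic result) follows the caption.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Pair:
    """One (medical term, lay definition) record. Hypothetical schema."""
    term: str
    general_definition: str  # e.g. retrieved from a resource such as UMLS
    lay_definition: str


Examiner = Callable[[Pair], bool]   # True means "keep as high quality"
Augmenter = Callable[[Pair], Pair]  # returns a rewritten (synthetic) pair


def split(pairs: List[Pair], examiner: Examiner) -> Tuple[List[Pair], List[Pair]]:
    """Partition pairs into (good, bad) according to the examiner."""
    good = [p for p in pairs if examiner(p)]
    bad = [p for p in pairs if not examiner(p)]
    return good, bad


def eae(r_exp: List[Pair], examiner: Examiner, augmenter: Augmenter):
    """Examiner -> Augmenter -> Examiner over expert-annotated data."""
    r_exp_good, r_exp_bad = split(r_exp, examiner)   # examine expert data
    r_syn = [augmenter(p) for p in r_exp_bad]        # rewrite rejected pairs
    r_syn_good, r_syn_bad = split(r_syn, examiner)   # examine synthetic data
    return r_exp_good, r_syn_good, r_syn_bad


# Toy stand-ins (assumptions): a real system would prompt an LLM instead.
def toy_examiner(p: Pair) -> bool:
    # Accept only lay definitions with at least three words.
    return len(p.lay_definition.split()) >= 3


def toy_augmenter(p: Pair) -> Pair:
    # Derive a lay definition from the general definition when one exists.
    if not p.general_definition:
        return p
    return replace(p, lay_definition="In plain words: " + p.general_definition.lower())


# Invented example records, not taken from the README dataset.
r_exp = [
    Pair("hypertension", "Persistently high arterial blood pressure",
         "high blood pressure that does not go away"),
    Pair("dyspnea", "Difficult or labored breathing", ""),
    Pair("NPO", "", ""),
]

r_exp_good, r_syn_good, r_syn_bad = eae(r_exp, toy_examiner, toy_augmenter)
```

Passing the examiner and augmenter in as plain callables keeps the control flow separate from whichever model or prompt fills each role, so the same loop could be reused with a different examiner for the synthetic stage.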
