Augmented Risk Prediction for the Onset of Alzheimer's Disease from Electronic Health Records with Large Language Models
Jiankun Wang, Sumyeong Ahn, Taykhoom Dalal, Xiaodan Zhang, Weishen Pan, Qiannan Zhang, Bin Chen, Hiroko H. Dodge, Fei Wang, Jiayu Zhou
TL;DR
This work addresses risk prediction for Alzheimer's disease and related dementias (ADRD) from electronic health records (EHRs) by proposing a collaborative pipeline that combines supervised learning methods (SLs) with large language models (LLMs) through a confidence-guided routing mechanism. The method summarizes tabular EHR data into natural language, trains SLs on the data, and keeps the SL prediction for confident cases while routing uncertain cases to LLMs with in-context learning (ICL), with ICL demonstrations drawn from a reliable subset. Key contributions include a detailed data construction pipeline from the OHSU EHR warehouse, a principled confidence-driven decision rule governed by a threshold $\sigma$, and extensive ablations demonstrating the benefits of EHR summarization, similarity-based demonstration retrieval, and denoising strategies. The ablations also show that larger or medically fine-tuned models do not uniformly improve performance. Empirical results on six CP_PW configurations show improved F1 scores over baselines, suggesting practical value for early ADRD screening and patient management, though the task remains challenging because clinical notes and demographic data are missing. The work highlights the potential of combining SLs and LLMs for healthcare screening applications and motivates further exploration of model selection, representations of structured medical data suited to reasoning, and scalable ICL strategies in clinical domains.
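As an illustration of similarity-based demonstration retrieval, the following minimal sketch ranks a pool of candidate demonstrations by cosine similarity to a query patient and assembles a few-shot prompt. The embedding vectors, the `retrieve_demonstrations` and `build_prompt` helpers, the prompt wording, and the example summaries are all hypothetical; the summary above does not specify how the paper embeds or formats EHR summaries.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve_demonstrations(query_vec, pool, k=2):
    """Return the k pool entries most similar to the query.

    Each pool entry is (embedding, summary_text, label). The pool is assumed
    to be the 'reliable subset' of training cases; how that subset is chosen
    is outside this sketch.
    """
    ranked = sorted(pool, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return ranked[:k]

def build_prompt(demos, query_summary):
    """Format demonstrations and the query as a few-shot prompt (hypothetical wording)."""
    parts = [f"Patient summary: {text}\nADRD onset: {'yes' if label else 'no'}"
             for _, text, label in demos]
    parts.append(f"Patient summary: {query_summary}\nADRD onset:")
    return "\n\n".join(parts)

# Toy pool with made-up 2-d embeddings and made-up summaries.
pool = [
    ([1.0, 0.0], "summary A", 1),
    ([0.0, 1.0], "summary B", 0),
    ([0.9, 0.1], "summary C", 1),
]
demos = retrieve_demonstrations([1.0, 0.0], pool, k=2)
prompt = build_prompt(demos, "summary Q")
```

In practice the embeddings would come from a text encoder applied to the natural-language EHR summaries; the toy vectors here only make the ranking easy to follow.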
Abstract
Alzheimer's disease (AD) is the fifth-leading cause of death among Americans aged 65 and older. Screening and early detection of AD and related dementias (ADRD) are critical for timely intervention and for identifying clinical trial participants. The widespread adoption of electronic health records (EHRs) offers an important resource for developing ADRD screening tools such as machine-learning-based predictive models. Recent advances in large language models (LLMs) demonstrate an unprecedented capability for encoding knowledge and performing reasoning, which gives them strong potential for enhancing risk prediction. This paper proposes a novel pipeline that augments risk prediction by leveraging the few-shot inference power of LLMs to make predictions on cases where traditional supervised learning methods (SLs) may not excel. Specifically, we develop a collaborative pipeline that combines SLs and LLMs via a confidence-driven decision-making mechanism, leveraging the strengths of SLs in clear-cut cases and of LLMs in more complex scenarios. We evaluate this pipeline using a real-world EHR data warehouse from Oregon Health & Science University (OHSU) Hospital, encompassing EHRs from over 2.5 million patients and more than 20 million patient encounters. Our results show that the proposed approach effectively combines the power of SLs and LLMs, offering significant improvements in predictive performance. This advancement holds promise for revolutionizing ADRD screening and early detection practices, with potential implications for better patient management strategies and thus improved healthcare.
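The confidence-driven mechanism can be pictured with a minimal sketch: the SL prediction is kept when the SL is confident, and the case is otherwise handed to an LLM. The specific rule used here (confidence is the larger of the two class probabilities, compared against a threshold `sigma`), the function names, and the stub LLM are assumptions made for illustration, not the paper's exact formulation.

```python
from typing import Callable, List, Sequence, Tuple

def route_predictions(
    sl_probs: Sequence[float],
    llm_predict: Callable[[int], int],
    sigma: float = 0.6,
) -> List[Tuple[int, str]]:
    """Return a (label, source) pair for each patient.

    sl_probs[i] is the SL's predicted probability of the positive class.
    Assumed rule: the SL is 'confident' when max(p, 1 - p) >= sigma;
    otherwise the case is deferred to the LLM.
    """
    results = []
    for i, p in enumerate(sl_probs):
        confidence = max(p, 1.0 - p)
        if confidence >= sigma:
            results.append((int(p >= 0.5), "SL"))
        else:
            results.append((llm_predict(i), "LLM"))
    return results

def stub_llm(index: int) -> int:
    """Placeholder for an LLM call with in-context demonstrations; always predicts 1."""
    return 1

probs = [0.95, 0.52, 0.10, 0.45]
routed = route_predictions(probs, stub_llm, sigma=0.6)
for p, (label, source) in zip(probs, routed):
    print(f"p={p:.2f} -> label={label} via {source}")
```

With these toy probabilities, the first and third cases stay with the SL and the second and fourth are deferred. Raising `sigma` sends more cases to the LLM, which trades inference cost against the SL's error rate on borderline cases.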
