LLM-MINE: Large Language Model based Alzheimer's Disease and Related Dementias Phenotypes Mining from Clinical Notes

Mingchen Shao, Yuzhang Xie, Carl Yang, Jiaying Lu

Abstract

Accurate extraction of Alzheimer's Disease and Related Dementias (ADRD) phenotypes from electronic health records (EHRs) is critical for early-stage detection and disease staging. However, this information is usually embedded in unstructured text rather than in tabular data, which makes accurate extraction difficult. We therefore propose LLM-MINE, a Large Language Model-based phenotype mining framework for the automatic extraction of ADRD phenotypes from clinical notes. Using two expert-defined phenotype lists, we evaluate the extracted phenotypes by examining their statistical significance across cohorts and their utility for unsupervised disease staging. Chi-square analyses confirm statistically significant phenotype differences across cohorts, with memory impairment being the strongest discriminator. Few-shot prompting with the combined phenotype lists achieves the best clustering performance (ARI=0.290, NMI=0.232), substantially outperforming biomedical NER and dictionary-based baselines. Our results demonstrate that LLM-based phenotype extraction is a promising tool for discovering clinically meaningful ADRD signals from unstructured notes.
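
For readers unfamiliar with the clustering metrics quoted above, the following is a minimal sketch of how the Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI) compare cluster assignments against reference cohort labels. The function names and toy labels are hypothetical, and the arithmetic-mean normalization used for NMI is an assumption; the abstract does not state which variant the authors used.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(labels_true, labels_pred):
    """ARI: pair-counting agreement between two labelings, corrected for chance."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster on both sides
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def normalized_mutual_info(labels_true, labels_pred):
    """NMI with arithmetic-mean normalization (an assumed variant)."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    mi = sum(
        c / n * log(c * n / (rows[a] * cols[b])) for (a, b), c in joint.items()
    )
    h_true = -sum(c / n * log(c / n) for c in rows.values())
    h_pred = -sum(c / n * log(c / n) for c in cols.values())
    denom = (h_true + h_pred) / 2
    return mi / denom if denom > 0 else 1.0


# Toy example: six patients, two cohorts, two candidate clusterings.
cohorts = [0, 0, 0, 1, 1, 1]
perfect_clusters = [1, 1, 1, 0, 0, 0]  # same partition, cluster ids swapped
mixed_clusters = [0, 1, 0, 1, 0, 1]    # unrelated to the cohorts

ari_perfect = adjusted_rand_index(cohorts, perfect_clusters)
nmi_perfect = normalized_mutual_info(cohorts, perfect_clusters)
ari_mixed = adjusted_rand_index(cohorts, mixed_clusters)
```

Both metrics are invariant to how cluster ids are numbered, which is why the swapped labeling still scores as a perfect match. Values near zero indicate agreement close to what chance would produce.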

Paper Structure

This paper contains 2 sections, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Overview of the LLM-MINE framework. (Left) Key challenges motivating the use of LLMs for ADRD phenotype extraction from clinical notes. (Center) The LLM-MINE pipeline: unstructured clinical notes are processed against expert-defined phenotype lists by prompting an LLM to produce binary phenotype feature vectors. (Right) Downstream use cases, including statistical correlation analysis (chi-square tests), unsupervised disease staging (k-means clustering), and baseline comparison for automated phenotype extraction.
  • Figure 2: PCA for Phenotype List 1. Phenotypes are extracted using the few-shot prompting method.
  • Figure 3: PCA for Phenotype List 2. Phenotypes are extracted using the few-shot prompting method.
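
As a rough illustration of the statistical correlation analysis named in the Figure 1 caption, the sketch below tabulates one binary phenotype feature against cohort labels and computes a Pearson chi-square statistic. The function names, cohort names, and counts are hypothetical, and the lack of a continuity correction is an assumption; none of these details are taken from the paper.

```python
from collections import Counter


def contingency_table(phenotype_flags, cohort_labels):
    """Rows: phenotype present (1) / absent (0). Columns: cohorts in sorted order."""
    cohorts = sorted(set(cohort_labels))
    counts = Counter(zip(phenotype_flags, cohort_labels))
    return [[counts[(flag, cohort)] for cohort in cohorts] for flag in (1, 0)]


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of phenotype and cohort.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical binary feature for one phenotype across two cohorts of 50 patients.
flags = [1] * 30 + [0] * 20 + [1] * 10 + [0] * 40
cohort_labels = ["cohort_a"] * 50 + ["cohort_b"] * 50

table = contingency_table(flags, cohort_labels)
stat, dof = chi_square(table)
print(table, round(stat, 3), dof)
```

A larger statistic for a given number of degrees of freedom means the phenotype's frequency differs more strongly between cohorts. In practice a library routine such as `scipy.stats.chi2_contingency` would also return the p-value; note that it applies Yates' continuity correction to 2x2 tables by default, so its statistic can differ from the uncorrected one computed here.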