Can ChatGPT Diagnose Alzheimer's Disease?

Quoc-Toan Nguyen, Linh Le, Xuan-The Tran, Thomas Do, Chin-Teng Lin

TL;DR

This study investigates whether ChatGPT can diagnose Alzheimer's Disease (AD) by evaluating zero-shot and multi-shot prompting on 9,300 EHRs derived from the ADNI dataset, incorporating MRI volumetrics and cognitive test scores. Using GPT-4-turbo in a black-box framework, it reports that multimodal data substantially improves diagnostic accuracy and calibration, with multi-shot prompting achieving up to $0.946$ accuracy at a 75% confidence threshold when MRI and cognitive data are combined. The findings suggest ChatGPT can serve as a supportive diagnostic aid, particularly in settings with limited access to AD specialists, while also emphasizing the need for larger, more diverse datasets and fairness assessments to ensure robust, equitable deployment. The work lays groundwork for integrating LLM-based diagnosis into clinical workflows and calls for future comparisons with alternative models to validate relative performance.

Abstract

Can ChatGPT diagnose Alzheimer's Disease (AD)? AD is a devastating neurodegenerative condition that affects approximately 1 in 9 individuals aged 65 and older, profoundly impairing memory and cognitive function. This paper utilises 9300 electronic health records (EHRs) with data from Magnetic Resonance Imaging (MRI) and cognitive tests to address an intriguing question: As a general-purpose task solver, can ChatGPT accurately detect AD using EHRs? We present an in-depth evaluation of ChatGPT using a black-box approach with zero-shot and multi-shot methods. This study unlocks ChatGPT's capability to analyse MRI and cognitive test results, as well as its potential as a diagnostic tool for AD. By automating aspects of the diagnostic process, this research opens a transformative approach for the healthcare system, particularly in addressing disparities in resource-limited regions where AD specialists are scarce. Hence, it offers a foundation for a promising method for early detection, supporting individuals with timely interventions, which is paramount for Quality of Life (QoL).

Paper Structure

This paper contains 14 sections, 10 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: The Workflow of Exploring ChatGPT’s Potential in Diagnosing AD. $P_{ZERO}$ and $P_{MULTI}$ are the prompts used to obtain the predictive results.
  • Figure 2: Visualisation of Performance Metrics of Zero-Shot and Multi-Shot Prompting with ChatGPT for Detecting AD.
  • Figure 3: Visualisation of Calibration Metrics of Zero-Shot and Multi-Shot Prompting with ChatGPT for Detecting AD.
  • Figure 4: Accurate Samples with Different Thresholds from Zero-Shot and Multi-Shot Prompting for Detecting AD.
  • Figure 5: Accurate Samples with Confidence Scores (%) Distribution from Zero-Shot and Multi-Shot Prompting for Detecting AD. The blue line represents the average, while the green dashed line is the median.
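
The TL;DR reports accuracy "at a 75% confidence threshold", and Figure 4 looks at accurate samples across different thresholds. As a minimal sketch of how such a metric could be computed, the Python snippet below keeps only the predictions whose self-reported confidence meets the threshold and measures accuracy on that subset. The function name `accuracy_at_threshold`, the label strings, the toy data, and the choice of `>=` for the cut-off are all assumptions made for illustration; the paper's exact definition may differ.

```python
from typing import List, Tuple


def accuracy_at_threshold(
    labels: List[str],
    predictions: List[str],
    confidences: List[float],
    threshold: float = 75.0,
) -> Tuple[float, int]:
    """Accuracy over predictions whose confidence (in %) is >= threshold.

    Returns (accuracy, number_of_kept_samples). Hypothetical helper,
    not the paper's implementation.
    """
    # Keep only (label, prediction) pairs that clear the confidence cut-off.
    kept = [
        (y, p)
        for y, p, c in zip(labels, predictions, confidences)
        if c >= threshold
    ]
    if not kept:
        # No sample is confident enough, so accuracy is undefined.
        return float("nan"), 0
    correct = sum(1 for y, p in kept if y == p)
    return correct / len(kept), len(kept)


# Made-up toy data, purely illustrative (not taken from the paper).
labels = ["AD", "non-AD", "AD", "non-AD", "AD"]
predictions = ["AD", "non-AD", "non-AD", "non-AD", "AD"]
confidences = [90.0, 80.0, 60.0, 70.0, 85.0]

acc_all, n_all = accuracy_at_threshold(labels, predictions, confidences, 0.0)
acc_75, n_75 = accuracy_at_threshold(labels, predictions, confidences, 75.0)
print(f"all samples: accuracy={acc_all:.2f} on {n_all} samples")
print(f">=75% confidence: accuracy={acc_75:.2f} on {n_75} samples")
```

Raising the threshold typically trades coverage for accuracy: fewer samples are retained, so any thresholded accuracy is best read together with the number of samples it was computed on.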