Multimodal Neurodegenerative Disease Subtyping Explained by ChatGPT

Diego Machado Reyes, Hanqing Chao, Juergen Hahn, Li Shen, Pingkun Yan

TL;DR

The paper tackles early subtyping of Alzheimer's disease by integrating imaging, genetic, and clinical data through a novel tri-modal co-attention mechanism (Tri-COAT). It combines modality-specific encoders with a cross-modal attention module to produce discriminative representations for three subtypes defined by longitudinal MMSE decline, achieving a mean AUROC of approximately $0.733$ on ADNI data and outperforming baselines. A key innovation is the use of prompts built from the model's predictions and integrated gradients feature attributions to obtain interpretable narratives from large language models (ChatGPT and Bard), linking cross-modal biomarker associations to known biology while acknowledging LLM limitations. The work supports the potential for early, explainable, multimodal subtyping with implications for personalized intervention, and paves the way for extensions to other neurodegenerative diseases and broader clinical workflows.

Abstract

Alzheimer's disease (AD) is the most prevalent neurodegenerative disease; yet its currently available treatments are limited to stopping disease progression. Moreover, the effectiveness of these treatments is not guaranteed due to the heterogeneity of the disease. Therefore, it is essential to be able to identify the disease subtypes at a very early stage. Current data-driven approaches are able to classify the subtypes at later stages of AD or related disorders, but struggle when predicting at the asymptomatic or prodromal stage. Moreover, most existing models either lack explainability behind the classification or use only a single modality for the assessment, limiting the scope of their analysis. Thus, we propose a multimodal framework that uses early-stage indicators such as imaging, genetics, and clinical assessments to classify AD patients into subtypes at early stages. In addition, we build prompts and use large language models, such as ChatGPT, to interpret the findings of our model. In our framework, we propose a tri-modal co-attention mechanism (Tri-COAT) to explicitly learn the cross-modal feature associations. Our proposed model outperforms baseline models and provides insight into key cross-modal feature associations supported by known biological mechanisms.
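
To make the idea of cross-modal attention more concrete, the following is a minimal NumPy sketch, not the authors' Tri-COAT implementation. It assumes each modality has already been encoded into a set of feature tokens of equal width, lets each modality attend over the tokens of the other two with plain scaled dot-product attention, and omits learned projections entirely. The function names, token counts, and the mean-pooling step are illustrative assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query, context):
    """Scaled dot-product attention of query tokens over context tokens.

    query:   (n_q, d) array
    context: (n_ctx, d) array, used as both keys and values (an assumption;
             a trained model would apply learned projections first).
    """
    d = query.shape[-1]
    scores = query @ context.T / np.sqrt(d)  # (n_q, n_ctx)
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ context, weights


def tri_modal_co_attention(imaging, genetic, clinical):
    """Each modality attends to the concatenated tokens of the other two.

    Hypothetical formulation for illustration only.
    """
    mods = {"imaging": imaging, "genetic": genetic, "clinical": clinical}
    attended, weights = {}, {}
    for name, q in mods.items():
        ctx = np.concatenate(
            [m for other, m in mods.items() if other != name], axis=0
        )
        attended[name], weights[name] = cross_attention(q, ctx)
    # Mean-pool each attended modality and concatenate into one joint vector.
    joint = np.concatenate([attended[n].mean(axis=0) for n in mods])
    return joint, weights


# Toy inputs: 5 imaging, 7 genetic, and 3 clinical tokens of width 8.
rng = np.random.default_rng(0)
imaging = rng.normal(size=(5, 8))
genetic = rng.normal(size=(7, 8))
clinical = rng.normal(size=(3, 8))
joint, weights = tri_modal_co_attention(imaging, genetic, clinical)
```

In this sketch the attention weight matrices (for example `weights["imaging"]`, with one row per imaging token and one column per genetic or clinical token) are the kind of quantity one could inspect for cross-modal feature associations; a joint vector like `joint` would then feed a subtype classifier.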

Paper Structure (16 sections, 2 equations, 5 figures, 3 tables)


Figures (5)

  • Figure 1: The three main multimodal fusion strategies, early, intermediate and late fusion, for deep learning methods.
  • Figure 2: Illustration of the proposed framework for AD subtyping, consisting of two main sections: (a) single modality encoding and (b) tri-modal attention and joint encoding.
  • Figure 3: AD subtype clusters based on the decrease of MMSE at each visit. (a) Each line represents the average score across patients for each cluster, and the shaded band represents one standard deviation. (b) Individual lines are plotted for each patient.
  • Figure 4: AD key biomarker associations from learned co-attention.
  • Figure 5: Tri-COAT interpretability through ChatGPT. Prompts were built based on Tri-COAT's prediction and integrated gradients feature attributions for each patient. Example portions of the prompts are shown in the blue text bubbles and ChatGPT answers are shown in the green text bubbles.
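
The prompt construction described in the Figure 5 caption could look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' prompt template: the wording of the prompt, the subtype label, the feature names, and the attribution values are all hypothetical, and the attributions are taken as given rather than computed with integrated gradients.

```python
def build_prompt(predicted_subtype, attributions, top_k=3):
    """Build an LLM prompt from a prediction and per-feature attributions.

    predicted_subtype: label of the predicted subtype (hypothetical naming).
    attributions: dict mapping feature name to a signed attribution score,
                  e.g. as produced by an integrated gradients routine.
    top_k: number of features to include, ranked by absolute attribution.
    """
    ranked = sorted(
        attributions.items(), key=lambda kv: abs(kv[1]), reverse=True
    )[:top_k]
    lines = [f"- {name}: attribution {score:+.3f}" for name, score in ranked]
    return (
        f"A model predicted the Alzheimer's disease subtype "
        f"'{predicted_subtype}'.\n"
        "The most influential features by integrated gradients were:\n"
        + "\n".join(lines)
        + "\nExplain how these features might relate to this subtype."
    )


# Hypothetical patient-level attributions, for illustration only.
example = build_prompt(
    "fast decline",
    {
        "hippocampal_volume": -0.42,
        "APOE_rs429358": 0.31,
        "education_years": -0.12,
    },
    top_k=2,
)
print(example)
```

Ranking by absolute value keeps features that push strongly against the prediction as well as those that support it, which gives the language model both sides of the attribution picture.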