Cross-modal Causal Intervention for Alzheimer's Disease Prediction
Yutao Jin, Haowen Xiao, Junyong Zhai, Yuxiao Li, Jielei Chu, Fengmao Lv, Yuxiao Li
TL;DR
MediAD addresses Alzheimer's disease diagnosis under confounding from unobserved variables by grounding multi-modal predictions in a structural causal model. It introduces a cross-modal Causal Fusion module to generate a mediator from fused visual and textual features and applies a Front-Door Adjustment to mitigate confounding effects, aided by a consistency loss. Textual inputs are enriched with LLM-generated clinical summaries, creating a richer multi-modal representation alongside MRI-derived features. Experiments on NACC and ADNI show MediAD achieving state-of-the-art or competitive accuracy for CN/MCI/AD classification and CN/AD binary tasks, validating the efficacy of combining causal intervention with multi-modal learning in neurological diagnosis.
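To make the front-door idea concrete, the following is a minimal sketch on a toy discrete model, not MediAD's implementation (which, per the summary above, operates on fused visual and textual features). It assumes a hypothetical structural causal model with an unobserved confounder U, an input X, a mediator M, and an outcome Y, where U affects X and Y but not M. All probabilities are made-up illustrative values. The sketch applies the standard front-door formula P(Y | do(X=x)) = sum_m P(m | x) sum_x' P(Y | m, x') P(x') using only the observational joint over (X, M, Y), and compares it with the interventional quantity computed directly from the model.

```python
from itertools import product

# Hypothetical toy SCM (illustrative numbers only, not taken from the paper):
# U -> X, X -> M, M -> Y, U -> Y, where U is treated as unobserved.
P_U = {0: 0.6, 1: 0.4}                    # P(U=u)
P_X1_GIVEN_U = {0: 0.2, 1: 0.8}           # P(X=1 | U=u)
P_M1_GIVEN_X = {0: 0.3, 1: 0.9}           # P(M=1 | X=x)
P_Y1_GIVEN_MU = {(0, 0): 0.1, (0, 1): 0.5,
                 (1, 0): 0.6, (1, 1): 0.9}  # P(Y=1 | M=m, U=u)


def bern(p1, value):
    """Probability of `value` under a Bernoulli with P(1) = p1."""
    return p1 if value == 1 else 1.0 - p1


def observational_joint():
    """Return P(x, m, y) with the confounder U marginalised out."""
    joint = {}
    for u, x, m, y in product((0, 1), repeat=4):
        p = (P_U[u] * bern(P_X1_GIVEN_U[u], x)
             * bern(P_M1_GIVEN_X[x], m) * bern(P_Y1_GIVEN_MU[(m, u)], y))
        joint[(x, m, y)] = joint.get((x, m, y), 0.0) + p
    return joint


def front_door(joint, x):
    """Estimate P(Y=1 | do(X=x)) from the observational joint alone."""
    p_x = {a: sum(p for (xx, _, _), p in joint.items() if xx == a)
           for a in (0, 1)}
    total = 0.0
    for m in (0, 1):
        p_m_given_x = (joint[(x, m, 0)] + joint[(x, m, 1)]) / p_x[x]
        inner = 0.0
        for xp in (0, 1):
            p_xm = joint[(xp, m, 0)] + joint[(xp, m, 1)]
            inner += joint[(xp, m, 1)] / p_xm * p_x[xp]  # P(Y=1|m,x') P(x')
        total += p_m_given_x * inner
    return total


def ground_truth(x):
    """P(Y=1 | do(X=x)) computed from the SCM with U known."""
    return sum(P_U[u] * bern(P_M1_GIVEN_X[x], m) * P_Y1_GIVEN_MU[(m, u)]
               for u in (0, 1) for m in (0, 1))


def naive(joint, x):
    """Plain conditional P(Y=1 | X=x), which is biased by U."""
    p_x = sum(p for (xx, _, _), p in joint.items() if xx == x)
    return (joint[(x, 0, 1)] + joint[(x, 1, 1)]) / p_x


joint = observational_joint()
for x in (0, 1):
    print(x, naive(joint, x), front_door(joint, x), ground_truth(x))
```

In this toy setting the front-door estimate matches the interventional value while the plain conditional does not, which is the kind of spurious correlation the adjustment is meant to remove. The agreement relies on the assumed graph: the mediator must carry all of the effect of X on Y and must not itself be influenced by U.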
Abstract
Mild Cognitive Impairment (MCI) is a prodromal stage of Alzheimer's Disease (AD), and early identification and intervention can effectively slow the progression to dementia. However, diagnosing AD remains a significant challenge in neurology because of confounders that arise mainly from selection bias in multi-modal data and from the complex relationships between variables. To address these issues, we propose a novel visual-language, causality-inspired framework for diagnostic assistance named Cross-modal Causal Intervention with Mediator for Alzheimer's Disease Diagnosis (MediAD). MediAD employs Large Language Models (LLMs) to summarize clinical data under strict templates, thereby enriching the textual inputs. The model uses Magnetic Resonance Imaging (MRI), clinical data, and the LLM-enriched textual data to classify participants into Cognitively Normal (CN), MCI, and AD categories. Because of confounders such as cerebrovascular lesions and age-related biomarkers, non-causal models are likely to capture spurious input-output correlations and therefore produce less reliable results. Our framework implicitly mitigates the effect of both observable and unobservable confounders through a unified causal intervention method. Experimental results show that our method performs strongly in distinguishing CN/MCI/AD cases, outperforming other methods on most evaluation metrics. The study showcases the potential of integrating causal reasoning with multi-modal learning for neurological disease diagnosis.
