Alzheimer's Disease Brain Network Mining

Alireza Moayedikia; Sara Fin

TL;DR

The paper tackles the challenge of diagnosing Alzheimer's disease from large, heterogeneous neuroimaging datasets with limited ground-truth labels. It introduces MATCH-AD, a semi-supervised framework that unifies deep latent representation learning, graph-based label propagation, and entropically regularized optimal transport to model disease progression. The approach provides theoretical convergence guarantees and demonstrates strong empirical performance on 4,968 NACC subjects, achieving near-perfect accuracy (0.984) and Cohen’s kappa (0.962) with only 29.1% labeled data, outperforming twelve baselines. The work highlights the importance of latent-space propagation and principled progression modeling for reliable, scalable clinical deployment, and discusses deployment guidance, limitations, and future extensions. Overall, MATCH-AD shows how principled semi-supervised learning can unlock the diagnostic potential of vast unlabeled neuroimaging repositories while maintaining clinical reliability.
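To make the entropically regularized optimal transport component concrete, here is a minimal NumPy sketch of Sinkhorn iterations between two discrete distributions. It is an illustration only, not the MATCH-AD implementation: the function name `sinkhorn`, the regularization strength `eps`, and the iteration count are assumptions chosen for this example.

```python
import numpy as np


def sinkhorn(a, b, cost, eps=1.0, n_iters=1000):
    """Entropically regularized OT between histograms a and b (illustrative).

    a, b : 1-D arrays of non-negative weights that each sum to 1.
    cost : (len(a), len(b)) array of pairwise transport costs.
    eps  : entropic regularization strength (hypothetical default).
    Returns the transport plan and its transport cost <plan, cost>.
    """
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)              # rescale rows to match a
        v = b / (K.T @ u)            # rescale columns to match b
    plan = u[:, None] * K * v[None, :]
    return plan, float(np.sum(plan * cost))
```

A usage pattern might compare the distribution of subjects over a few latent bins in two cognitive states, with `cost` holding squared distances between bin centers. Smaller `eps` approximates the unregularized Wasserstein cost more closely but converges more slowly and may need log-domain stabilization.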

Abstract

Machine learning approaches for Alzheimer's disease (AD) diagnosis face a fundamental challenge: clinical assessments are expensive and invasive, leaving ground-truth labels available for only a fraction of neuroimaging datasets. We introduce Multi-view Adaptive Transport Clustering for Heterogeneous Alzheimer's Disease (MATCH-AD), a semi-supervised framework that integrates deep representation learning, graph-based label propagation, and optimal transport theory to address this limitation. The framework leverages manifold structure in neuroimaging data to propagate diagnostic information from limited labeled samples to larger unlabeled populations, while using Wasserstein distances to quantify disease progression between cognitive states. Evaluated on nearly five thousand subjects from the National Alzheimer's Coordinating Center, with structural MRI measurements from hundreds of brain regions, cerebrospinal fluid biomarkers, and clinical variables, MATCH-AD achieves near-perfect diagnostic accuracy despite having ground-truth labels for less than one-third of subjects. The framework substantially outperforms all baseline methods: its kappa indicates almost perfect agreement, whereas the best baseline reaches only weak agreement, a qualitative change in diagnostic reliability. Performance remains clinically useful even under severe label scarcity, and we provide theoretical convergence guarantees with proven bounds on label propagation error and transport stability. These results demonstrate that principled semi-supervised learning can unlock the diagnostic potential of the vast repositories of partially annotated neuroimaging data accumulating worldwide, substantially reducing annotation burden while maintaining accuracy suitable for clinical deployment.
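The graph-based label propagation step can be sketched as follows. This is a minimal example under stated assumptions, not the authors' code: it uses the standard closed-form label spreading solution F = (I - alpha S)^(-1) Y on a symmetrized k-nearest-neighbor graph, and the function name `propagate_labels`, the parameters `k` and `alpha`, and the Gaussian affinity are all hypothetical choices. `Z` stands in for latent embeddings produced by some encoder.

```python
import numpy as np


def propagate_labels(Z, y, n_classes, k=7, alpha=0.9):
    """Spread labels over a kNN graph built on embeddings Z (illustrative).

    Z : (n, d) array of latent embeddings.
    y : (n,) integer labels, with -1 marking unlabeled samples.
    Returns an (n,) array of predicted class indices.
    """
    n = Z.shape[0]
    # Pairwise squared Euclidean distances.
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    # Indices of the k nearest neighbors, skipping each point itself.
    nbrs = np.argsort(d2, axis=1)[:, 1:k + 1]
    sigma2 = np.median(d2[d2 > 0])            # bandwidth heuristic (assumption)
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / sigma2)
    W = np.maximum(W, W.T)                    # symmetrize the graph
    deg = W.sum(axis=1)
    S = W / np.sqrt(np.outer(deg, deg))       # D^{-1/2} W D^{-1/2}
    # One-hot matrix for labeled samples, zeros for unlabeled ones.
    Y = np.zeros((n, n_classes))
    labeled = np.flatnonzero(y >= 0)
    Y[labeled, y[labeled]] = 1.0
    # Closed-form solution of the propagation fixed point.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    return F.argmax(axis=1)
```

With `y` set to -1 for every subject lacking a clinical diagnosis, the returned array assigns each subject the class whose propagated score is largest. The dense solve costs O(n^3), so a dataset of several thousand subjects would more plausibly use sparse matrices or iterative updates.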
