Clustering Alzheimer's Disease Subtypes via Similarity Learning and Graph Diffusion
Tianyi Wei, Shu Yang, Davoud Ataee Tarzanagh, Jingxuan Bao, Jia Xu, Patryk Orzechowski, Joost B. Wagenaar, Qi Long, Li Shen
TL;DR
Alzheimer's disease exhibits substantial heterogeneity that complicates diagnosis and treatment. The authors develop a framework combining SIMLR, a multi-kernel similarity learning method, with graph diffusion to jointly learn subject similarities and identify subtypes from MRI-derived cortical thickness features. Applied to data from the ADNI/TADPOLE cohort, they identify five AD/MCI subtypes with distinct biomarker and cognitive profiles, supported by targeted genetic associations (e.g., INPP5D), suggesting subtype-specific etiologies. The approach outperforms traditional clustering methods, driven largely by diffusion-based noise reduction, and is made available via public code, highlighting its potential for refining AD subtyping and guiding personalized interventions.
Abstract
Alzheimer's disease (AD) is a complex neurodegenerative disorder that affects millions of people worldwide. Because AD is heterogeneous, its diagnosis and treatment pose critical challenges. Consequently, research interest in identifying homogeneous AD subtypes that can help address these challenges has grown in recent years. In this study, we aim to identify AD subtypes with distinctive clinical features and underlying pathology by using unsupervised clustering with graph diffusion and similarity learning. We adopted SIMLR, a multi-kernel similarity learning framework, together with graph diffusion to cluster a group of 829 patients with AD and mild cognitive impairment (MCI, a prodromal stage of AD) based on cortical thickness measurements extracted from magnetic resonance imaging (MRI) scans. Although this clustering approach has not previously been explored for AD subtyping, it performed significantly better than several commonly used clustering methods. In particular, we showed that graph diffusion reduces the effects of noise in subtype detection. Our results revealed five subtypes that differed markedly in their biomarkers, cognitive status, and other clinical features. To further evaluate the resulting subtypes, we carried out a genetic association study, which identified potential genetic underpinnings of the different AD subtypes. Our source code is available at: https://github.com/PennShenLab/AD-SIMLR.
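
As a rough illustration of the graph diffusion idea mentioned in the abstract, the sketch below smooths a subject-by-subject similarity matrix with a generic random-walk diffusion on a k-nearest-neighbor graph. It is not the authors' implementation (see the linked repository for that): the single Gaussian kernel, the function names `gaussian_similarity`, `knn_transition`, and `diffuse`, the parameters `sigma`, `k`, `alpha`, and `n_iter`, and the synthetic two-group data are all assumptions made for this example.

```python
import numpy as np


def gaussian_similarity(X, sigma=1.0):
    """Pairwise Gaussian-kernel similarity between the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances, clipped at 0 to absorb rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def knn_transition(S, k):
    """Row-stochastic transition matrix built from each row's k strongest neighbors."""
    n = S.shape[0]
    W = S.copy()
    np.fill_diagonal(W, 0.0)  # a subject is not its own neighbor
    idx = np.argsort(-W, axis=1)[:, :k]  # indices of the k largest similarities per row
    mask = np.zeros((n, n), dtype=bool)
    mask[np.arange(n)[:, None], idx] = True
    W = np.where(mask, W, 0.0)
    return W / W.sum(axis=1, keepdims=True)


def diffuse(S, k=5, alpha=0.8, n_iter=20):
    """Iterate H <- alpha * H @ P + (1 - alpha) * I, then symmetrize the result."""
    P = knn_transition(S, k)
    identity = np.eye(S.shape[0])
    H = S.copy()
    for _ in range(n_iter):
        H = alpha * H @ P + (1.0 - alpha) * identity
    return (H + H.T) / 2.0


# Synthetic stand-in for cortical thickness features: two well-separated groups.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(10, 4)),
    rng.normal(10.0, 0.5, size=(10, 4)),
])
S = gaussian_similarity(X, sigma=1.0)
H = diffuse(S, k=5, alpha=0.8, n_iter=20)
```

The diffused matrix `H` could then be passed to any clustering routine that accepts a precomputed affinity matrix. In this sketch `alpha` controls how far similarity propagates along the neighbor graph, so smaller values keep `H` closer to the identity.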
