MRI Embeddings Complement Clinical Predictors for Cognitive Decline Modeling in Alzheimer's Disease Cohorts
Nathaniel Putera, Daniel Vilet Rodríguez, Noah Videcrantz, Julia Machnio, Mostafa Mehdipour Ghazi
TL;DR
This paper addresses the challenge of predicting heterogeneous cognitive decline in Alzheimer's disease by comparing tabular clinical predictors with transformer-derived MRI embeddings. It introduces trajectory-aware labeling via Dynamic Time Warping clustering and trains a 3D Vision Transformer via unsupervised MRI reconstruction to produce anatomy-preserving embeddings, which are then evaluated against tabular representations and CNN baselines across four decline classes. The results show complementary strengths: clinical and volumetric tabular features achieve the highest AUCs (around 0.70) for mild and severe decline, while ViT-derived MRI embeddings best identify cognitively stable trajectories (AUC of 0.71); all approaches struggle on the heterogeneous moderate group. These findings motivate multimodal fusion of clinical risk markers with image-derived representations for early stratification and personalized management.
Abstract
Accurate modeling of cognitive decline in Alzheimer's disease is essential for early stratification and personalized management. While tabular predictors provide robust markers of global risk, their ability to capture subtle brain changes remains limited. In this study, we evaluate the predictive contributions of tabular and imaging-based representations, with a focus on transformer-derived Magnetic Resonance Imaging (MRI) embeddings. We introduce a trajectory-aware labeling strategy based on Dynamic Time Warping clustering to capture heterogeneous patterns of cognitive change, and train a 3D Vision Transformer (ViT) via unsupervised reconstruction on harmonized and augmented MRI data to obtain anatomy-preserving embeddings without progression labels. The pretrained encoder embeddings are subsequently assessed using both traditional machine learning classifiers and deep learning heads, and compared against tabular representations and convolutional network baselines. Results highlight complementary strengths across modalities. Clinical and volumetric features achieved the highest AUCs of around 0.70 for predicting mild and severe progression, underscoring their utility in capturing global decline trajectories. In contrast, MRI embeddings from the ViT model were most effective in distinguishing cognitively stable individuals with an AUC of 0.71. However, all approaches struggled in the heterogeneous moderate group. These findings indicate that clinical features excel in identifying high-risk extremes, whereas transformer-based MRI embeddings are more sensitive to subtle markers of stability, motivating multimodal fusion strategies for AD progression modeling.
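To make the trajectory-aware labeling idea concrete, the following is a minimal sketch of clustering score trajectories by Dynamic Time Warping distance. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not specify the clustering algorithm, the cognitive measure, or the preprocessing, so the k-medoids routine, the function names `dtw_distance` and `k_medoids`, and the toy trajectories are all hypothetical choices made here for clarity.

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW with absolute-difference local cost, O(len(a) * len(b))."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def k_medoids(dist: List[List[float]], k: int, n_iter: int = 20) -> List[int]:
    """Small k-medoids on a precomputed distance matrix; returns one label per item.

    Assumption: k-medoids is used only because it works directly on pairwise
    DTW distances; the source does not say which clustering method was used.
    """
    n = len(dist)
    # Deterministic farthest-point initialisation (an illustrative choice).
    medoids = [0]
    while len(medoids) < k:
        medoids.append(
            max(range(n), key=lambda i: min(dist[i][m] for m in medoids))
        )
    labels = [0] * n
    for _ in range(n_iter):
        # Assign every trajectory to its nearest medoid.
        labels = [
            min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)
        ]
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the previous medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # The medoid minimises total distance to its cluster members.
            new_medoids.append(
                min(members, key=lambda i: sum(dist[i][j] for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels


# Synthetic change-from-baseline trajectories (hypothetical values, not study data).
# Subjects may have different numbers of visits; DTW handles unequal lengths.
trajectories = [
    [0.0, 0.1, 0.0, 0.1],       # roughly stable
    [0.0, 0.0, 0.1, 0.0, 0.1],  # roughly stable, one extra visit
    [0.0, 1.0, 2.0, 3.0],       # steady decline
    [0.0, 0.9, 2.1, 3.0, 3.1],  # steady decline, one extra visit
]
dist = [[dtw_distance(a, b) for b in trajectories] for a in trajectories]
labels = k_medoids(dist, k=2)
print(labels)
```

In this toy setting the two flat trajectories fall into one cluster and the two declining ones into the other, even though the visit counts differ. The cluster indices would then serve as trajectory labels for downstream classifiers; the four-class setup described above would correspond to `k=4` on real longitudinal scores.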
