OPTIMUS: Predicting Multivariate Outcomes in Alzheimer's Disease Using Multi-modal Data amidst Missing Values

Christelle Schneuwly Diaz; Duy-Thanh Vu; Julien Bodelet; Duy-Cat Can; Guillaume Blanc; Haiting Jiang; Lin Yao; Guiseppe Pantaleo; ADNI; Oliver Y. Chén

TL;DR

OPTIMUS tackles the many-to-many challenge of predicting multivariate Alzheimer's disease outcomes from multimodal data with missing values. It integrates modality-specific missing-data imputation, a TabNet-based multivariate predictor, and permutation-based explainability to map biomarkers to four cognitive domains. The framework identifies differential, biologically meaningful biomarkers across MRI, CSF, and transcriptomic data, with APOE $\epsilon$4 repeatedly emerging as a key predictor and imaging features localized to anatomically plausible regions. The results demonstrate improved predictive accuracy with multimodal data over any single modality and showcase the potential for interpretable, mechanistic insight into AD progression that could inform clinical decision-making.

Abstract

Alzheimer's disease (AD), a neurodegenerative disorder, is associated with neural, genetic, and proteomic factors and affects multiple cognitive and behavioral faculties. Traditional AD prediction largely focuses on univariate disease outcomes, such as disease stage and severity. Multimodal data encode broader disease information than a single modality and may therefore improve disease prediction, but they often contain missing values. Recent "deeper" machine learning approaches show promise in improving prediction accuracy, yet the biological relevance of these models remains to be charted. Integrating missing-data analysis, predictive modeling, multimodal data analysis, and explainable AI (XAI), we propose OPTIMUS, a predictive, modular, and explainable machine learning framework that unveils the many-to-many predictive pathways between multimodal input data and multivariate disease outcomes amidst missing values. OPTIMUS first applies modality-specific imputation to recover missing data in each modality while optimizing overall prediction accuracy. It then maps multimodal biomarkers to multivariate outcomes using machine learning and extracts the biomarkers predictive of each outcome. Finally, OPTIMUS uses XAI to explain the identified multimodal biomarkers. Using data from 346 cognitively normal subjects, 608 persons with mild cognitive impairment, and 251 AD patients, OPTIMUS identifies neural and transcriptomic signatures that jointly but differentially predict multivariate outcomes related to executive function, language, memory, and visuospatial function. Our work demonstrates the potential of a predictive and biologically explainable machine learning framework to uncover multimodal biomarkers that capture disease profiles across varying cognitive landscapes. The results improve our understanding of the complex many-to-many pathways in AD.
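The three stages described in the abstract (modality-specific imputation, a multivariate predictor, and per-outcome explainability) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the helper names (`impute`, `fit_ridge`, `permutation_importance`) are hypothetical, the imputation strategy per modality is fixed by hand rather than tuned for prediction accuracy, and a multi-output ridge regression stands in for the TabNet-based predictor so that the example needs only NumPy.

```python
import numpy as np

def impute(block, strategy):
    """Fill NaNs column-wise with the column mean or median (hypothetical helper)."""
    stat = np.nanmean if strategy == "mean" else np.nanmedian
    fill = stat(block, axis=0)
    out = block.copy()
    rows, cols = np.where(np.isnan(out))
    out[rows, cols] = fill[cols]
    return out

def fit_ridge(X, Y, alpha=1.0):
    """Multi-output ridge regression (stand-in for TabNet): one weight column per outcome."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append an intercept column
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[-1, -1] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ Y)

def predict(W, X):
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ W

def permutation_importance(W, X, Y, rng, n_repeats=20):
    """Mean increase in per-outcome MSE when a single feature column is shuffled."""
    base = ((predict(W, X) - Y) ** 2).mean(axis=0)
    scores = np.zeros((X.shape[1], Y.shape[1]))
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            scores[j] += ((predict(W, Xp) - Y) ** 2).mean(axis=0) - base
    return scores / n_repeats

# Synthetic stand-ins for two modalities (assumed shapes, not ADNI data).
rng = np.random.default_rng(0)
n = 300
mri = rng.normal(size=(n, 3))
csf = rng.normal(size=(n, 2))

# Two outcomes: the first driven by MRI feature 0, the second by CSF feature 1.
Y = np.column_stack([2.0 * mri[:, 0], -3.0 * csf[:, 1]])
Y = Y + 0.1 * rng.normal(size=Y.shape)

# Remove roughly 10% of the entries in each modality.
mri_obs = mri.copy()
mri_obs[rng.random(mri.shape) < 0.1] = np.nan
csf_obs = csf.copy()
csf_obs[rng.random(csf.shape) < 0.1] = np.nan

# Stage 1: impute each modality with its own strategy, then concatenate.
X = np.hstack([impute(mri_obs, "mean"), impute(csf_obs, "median")])
# Stage 2: fit one model that predicts all outcomes jointly.
W = fit_ridge(X, Y)
# Stage 3: rank features separately for each outcome.
importance = permutation_importance(W, X, Y, rng)
top_feature_per_outcome = importance.argmax(axis=0)
```

The `importance` matrix has one row per input feature and one column per outcome, so each column gives a separate feature ranking. That per-outcome ranking is the sense in which biomarkers can be "jointly but differentially" predictive.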
