Explainable Graph-theoretical Machine Learning: with Application to Alzheimer's Disease Prediction

Narmina Baghirova, Duy-Thanh Vũ, Duy-Cat Can, Christelle Schneuwly Diaz, Julien Bodlet, Guillaume Blanc, Georgi Hrusanov, Bernard Ries, Oliver Y. Chén

TL;DR

XGML builds subject-specific metabolic distance graphs from FDG-PET using KDE to estimate ROI distributions and DTW to measure inter-regional distances, enabling multivariate prediction of eight AD-related cognitive scores with a Kernel SVR framework. Permutation importance reveals edge-level subgraphs that serve as domain-specific or cross-domain network biomarkers, supporting a network-disruption view of Alzheimer’s disease. The approach avoids thresholding of connections and provides explainable biomarkers, achieving up to $r=0.74$ for certain scores and highlighting neurobiological relevance across frontal, parietal, and temporal regions. Limitations include single-cohort validation and reliance on FDG-PET, with future work extending to multimodal data and external datasets to improve generalizability and accuracy.

Abstract

Alzheimer's disease (AD) affects 50 million people worldwide and is projected to affect 152 million by 2050. AD is characterized by cognitive decline due partly to disruptions in metabolic brain connectivity. Thus, early and accurate detection of metabolic brain network impairments is crucial for AD management. FDG-PET data are central to identifying such impairments. Despite advancements, most graph-based studies using FDG-PET data rely on group-level analysis or thresholding. Yet group-level analysis can veil individual differences, and thresholding may overlook weaker but biologically critical brain connections. Additionally, machine learning-based AD prediction largely focuses on univariate outcomes, such as disease status. Here, we introduce explainable graph-theoretical machine learning (XGML), a framework that employs kernel density estimation and dynamic time warping to construct individual metabolic brain graphs, which capture the distance between each pair of brain regions, and that identifies the subgraphs most predictive of multivariate AD-related outcomes. Using FDG-PET data from the Alzheimer's Disease Neuroimaging Initiative, XGML builds metabolic brain graphs and uncovers subgraphs predictive of eight AD-related cognitive scores in new subjects. XGML shows robust performance, particularly for predicting scores measuring learning, memory, language, praxis, and orientation, such as CDRSB ($r = 0.74$), ADAS11 ($r = 0.73$), and ADAS13 ($r = 0.71$). Moreover, XGML unveils key edges jointly but differentially predictive of several AD-related outcomes; they may serve as potential network biomarkers for assessing overall cognitive decline. Together, we show the promise of graph-theoretical machine learning in biomarker discovery and disease prediction and its potential to improve our understanding of network neural mechanisms underlying AD.
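
To make the graph-construction idea concrete, the following is a minimal sketch, not the authors' implementation, of how kernel density estimation and dynamic time warping could be combined into a subject-level distance graph. Everything specific here is an assumption for illustration: the function names (`gaussian_kde`, `dtw_distance`, `metabolic_distance_graph`), the Gaussian kernel, the fixed bandwidth, the evaluation grid, the absolute-difference local cost, and the toy intensity values. It uses only the Python standard library.

```python
import math

def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `samples` at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]

def dtw_distance(a, b):
    """Classic DTW cost between two sequences, with |a_i - b_j| as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]

def metabolic_distance_graph(roi_voxels, grid, bandwidth=0.1):
    """Return a symmetric matrix of DTW distances between per-ROI density curves.

    `roi_voxels` is a list of lists: one list of (hypothetical, normalized)
    voxel intensities per region of interest. No edge is thresholded away.
    """
    densities = [gaussian_kde(v, grid, bandwidth) for v in roi_voxels]
    k = len(densities)
    graph = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            graph[i][j] = graph[j][i] = dtw_distance(densities[i], densities[j])
    return graph

# Toy example: three ROIs, the first two with identical intensity samples.
grid = [i / 20 for i in range(21)]
roi_voxels = [
    [0.30, 0.35, 0.40, 0.45],
    [0.30, 0.35, 0.40, 0.45],
    [0.70, 0.75, 0.80, 0.85],
]
graph = metabolic_distance_graph(roi_voxels, grid)
```

In this toy setup, the two ROIs with identical samples are at distance zero, while the third ROI, whose intensities are shifted, is at a strictly positive distance from both. Because every pairwise distance is retained, the resulting matrix is a fully weighted graph, which is the property that lets a downstream predictor use weak as well as strong edges.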

Paper Structure

This paper contains 5 sections, 2 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: A schematic representation of the explainable graph-theoretical machine learning (XGML) framework. (a) The construction of metabolic distance graphs. (b) From high-dimensional brain graphs to multivariate outcome prediction.
  • Figure 2: Inter- and intra-group metabolic brain network differences and their prediction of individual multivariate AD-related cognitive scores. (a) Metabolic distance graphs derived from FDG-PET scans corresponding to CN, MCI and AD groups. Each graph is shown from four viewpoints (left, right, superior, and posterior), overlaid with edges whose distance values are above $0.80$. (b) Out-of-sample prediction performance of XGML for predicting eight cognitive scores. The x- and y-axis represent observed and predicted values, respectively. Each line indicates goodness-of-fit; each shaded area represents the 95% confidence bands; the accuracy is quantified using the Pearson correlation coefficient (in the bottom right of each panel).
  • Figure 3: The top 10 most predictive edges for each cognitive score. Each box corresponds to a cognitive score and shows the most important edges, from four views: left, posterior, right, and superior, for predicting that score. The color gradient of brain regions reflects their predictive importance, with darker blue indicating higher relevance.
  • Figure 4: The functional distribution of the top ten most predictive edges for each of the eight cognitive scores. Each plot is color-coded to match the seven Yeo networks shown in the boxes below.
  • Figure 5: The average distance value of the top ten most predictive edges for the CN (blue) and AD (red) groups. Each bar within a box corresponds to one of the top ten most predictive edges for the corresponding cognitive score. In general, across edges and cognitive scores, the distance value is higher for AD patients than for CN subjects.
  • ...and 1 more figures