An Explainable Ensemble Framework for Alzheimer's Disease Prediction Using Structured Clinical and Cognitive Data

Nishan Mitra

TL;DR

This research introduces an explainable ensemble learning framework that classifies individuals as Alzheimer's or Non-Alzheimer's using structured clinical, lifestyle, and metabolic features, with strong potential for clinical decision support applications.

Abstract

Early and accurate detection of Alzheimer's disease (AD) remains a major challenge in medical diagnosis because of its subtle onset and progressive nature. This research introduces an explainable ensemble learning framework that classifies individuals as Alzheimer's or Non-Alzheimer's using structured clinical, lifestyle, and metabolic features. The workflow combines preprocessing, feature engineering, SMOTE-Tomek hybrid class balancing, and optimized modeling with five ensemble algorithms (Random Forest, XGBoost, LightGBM, CatBoost, and Extra Trees) alongside a deep artificial neural network. Model selection used stratified validation to prevent leakage, and the best-performing model was evaluated on a fully unseen test set. Ensemble methods outperformed deep learning, with XGBoost, Random Forest, and Soft Voting showing the strongest accuracy, sensitivity, and F1-score profiles. Explainability techniques, including SHAP and feature importance analysis, highlighted MMSE, Functional Assessment, Age, and several engineered interaction features as the most influential determinants. The results indicate that the proposed framework provides a reliable and transparent approach to Alzheimer's disease prediction, with strong potential for clinical decision support applications.
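The abstract names Soft Voting among the strongest models but does not describe how it combines the base learners. In the usual sense of the term, soft voting averages each model's predicted class probability and thresholds the result. The sketch below illustrates that idea in plain Python. The function names, the equal weights, the 0.5 threshold, and the probability values are all hypothetical and are not taken from the paper.

```python
def soft_vote(prob_by_model, weights=None):
    """Return the (optionally weighted) mean P(Alzheimer's) per patient.

    prob_by_model maps a model name to a list of predicted probabilities,
    one per patient, in the same patient order for every model.
    """
    models = list(prob_by_model)
    n_patients = len(prob_by_model[models[0]])
    if any(len(prob_by_model[m]) != n_patients for m in models):
        raise ValueError("every model must score the same patients")
    if weights is None:
        # Assumption: equal weights; the paper's weighting is not stated.
        weights = {m: 1.0 for m in models}
    total = sum(weights[m] for m in models)
    return [
        sum(weights[m] * prob_by_model[m][i] for m in models) / total
        for i in range(n_patients)
    ]


def classify(probabilities, threshold=0.5):
    """Map averaged probabilities to labels (1 = Alzheimer's, 0 = Non-Alzheimer's)."""
    return [int(p >= threshold) for p in probabilities]


# Hypothetical probabilities for three patients (illustrative values only).
example = {
    "random_forest": [0.90, 0.20, 0.60],
    "xgboost": [0.80, 0.10, 0.30],
    "extra_trees": [0.70, 0.30, 0.55],
}
averaged = soft_vote(example)
labels = classify(averaged)
print(labels)  # [1, 0, 0]
```

For the third patient, two of the three hypothetical models individually exceed 0.5, so a majority (hard) vote would predict Alzheimer's, while the averaged probability of about 0.48 falls below the threshold. That difference is the main reason to prefer one voting scheme over the other: soft voting lets a confident dissenting model outweigh two marginal ones.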

Paper Structure

This paper contains 15 sections, 7 figures, and 2 tables.

Figures (7)

  • Figure 1: Workflow of the proposed methodology.
  • Figure 2: Distribution of diagnosis classes in the dataset.
  • Figure 3: Confusion matrices of major models.
  • Figure 4: ROC curves for major models.
  • Figure 5: Gini-based feature importance.
  • ...and 2 more figures