
AI-Driven Predictive Analytics Approach for Early Prognosis of Chronic Kidney Disease Using Ensemble Learning and Explainable AI

K M Tawsik Jawad, Anusha Verma, Fathi Amsaad, Lamia Ashraf

TL;DR

The paper tackles early CKD prognosis by combining ensemble learning with Explainable AI to produce accurate predictions and interpretable explanations from blood and urine test data. It works with a rigorously preprocessed six-feature subspace, balances the data with SMOTE, and evaluates multiple classifiers, focusing on feature contributions and case-level explanations via LIME, SHAP, ALE, and counterfactuals. Random Forest emerges as the top performer, while XGBoost and AdaBoost offer strong fidelity and interpretability trade-offs, as measured by novel interpretability metrics. The work demonstrates the practical potential of explainable CKD prognostics for clinician-guided lifestyle interventions, while acknowledging dataset limitations and proposing richer, multi-center data and secure deployment as future directions.
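
To make the SMOTE balancing step named above concrete, the following is a minimal sketch of the general idea behind SMOTE-style oversampling: each synthetic minority sample is an interpolation between an existing minority sample and one of its nearest minority neighbours. It is an illustration under stated assumptions, not the authors' implementation; the function name `smote_like_oversample`, the parameter values, and the toy two-feature data are all hypothetical and are not taken from the paper.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data (two features per sample), not from the paper.
minority = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 3.0]]
new_points = smote_like_oversample(minority, n_new=6)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by the minority class. In practice a maintained library implementation would normally be preferred over a hand-written one.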

Abstract

Chronic Kidney Disease (CKD) is a widespread chronic disease with no known definitive cure and high morbidity. Research demonstrates that progressive CKD is a heterogeneous disorder that significantly impacts kidney structure and function, eventually leading to kidney failure. Over time, CKD has moved from a life-threatening disease affecting few people to a common disorder of varying severity. The goal of this research is to visualize the dominating features, feature scores, and feature values relevant to early prognosis and detection of CKD using ensemble learning and explainable AI. To that end, an AI-driven predictive analytics approach is proposed to aid clinical practitioners in prescribing lifestyle modifications for individual patients to slow the progression of the disease. Our dataset comprises body vitals collected from individuals with CKD and from healthy subjects, and is used to develop the proposed AI-driven solution. Blood and urine test results are provided as inputs, and ensemble tree-based machine-learning models are applied to predict unseen cases of CKD. Our research findings were validated through lengthy consultations with nephrologists. Our experiments and interpretation results are compared with existing explainable AI applications in various healthcare domains, including CKD. The comparison shows that our AI models, particularly the Random Forest model, identified more features as significant contributors than XGBoost. Interpretability (I), which measures the ratio of important to masked features, indicates that our XGBoost model achieved a higher score on this metric, with a Fidelity of 98%, and consequently a higher FII index than competing models.

Paper Structure (31 sections, 7 equations, 16 figures, 12 tables)

Figures (16)

  • Figure 1: Workflow Diagram of Overall Research Process
  • Figure 2: Cumulative Confusion Matrix for Random Forest
  • Figure 3: Decision Tree for Best Fold of Random Forest
  • Figure 4: Random Forest Feature Importances
  • Figure 5: Random Test Case (1) Explanations by LIME
  • ...and 11 more figures