Supervised Learning Models for Early Detection of Albuminuria Risk in Type-2 Diabetes Mellitus Patients

Arief Purnama Muharram, Dicky Levenus Tahapary, Yeni Dwi Lestari, Randy Sarayar, Valerie Josephine Dirjayanto

TL;DR

This study addresses predicting the risk of albuminuria—an early marker of diabetic nephropathy—in patients with type-2 diabetes using supervised learning. It compares seven models, including six traditional algorithms and a Multi-Layer Perceptron, on a private Jakarta dataset of 184 T2DM patients with 10 numerical features, labeled by KDIGO 2012 criteria. The MLP emerges as the top performer with accuracy 0.74 and F1-score 0.71, outperforming other models (e.g., Naïve Bayes at 0.65 accuracy), though overall performance suggests the need for more diverse data. The findings support the potential of deep learning for screening albuminuria in T2DM, while highlighting data size and heterogeneity as key factors for improvement and generalizability.

Abstract

Diabetes, especially T2DM, continues to be a significant health problem. One of the major concerns associated with diabetes is the development of its complications. Diabetic nephropathy, one of the chronic complications of diabetes, adversely affects the kidneys, leading to kidney damage. Diagnosing diabetic nephropathy involves considering various criteria, one of which is the presence of a pathologically significant quantity of albumin in urine, known as albuminuria. Thus, early prediction of albuminuria in diabetic patients holds the potential for timely preventive measures. This study aimed to develop a supervised learning model to predict the risk of developing albuminuria in T2DM patients. The selected supervised learning algorithms included Naïve Bayes, Support Vector Machine (SVM), decision tree, random forest, AdaBoost, XGBoost, and Multi-Layer Perceptron (MLP). Our private dataset, comprising 184 entries of diabetes complications risk factors, was used to train the algorithms. It consisted of 10 attributes as features and 1 attribute as the target (albuminuria). In the experiments, the MLP demonstrated superior performance compared to the other algorithms. It achieved accuracy and F1-score values as high as 0.74 and 0.75, respectively, making it suitable for screening purposes in predicting albuminuria in T2DM. Nonetheless, further studies are warranted to enhance the model's performance.
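For readers less familiar with the two metrics reported above, the following is a minimal sketch of how accuracy and F1-score are commonly computed for a binary classifier, assuming "albuminuria present" is the positive class. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the study's data or results.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a small held-out test set (NOT the paper's numbers):
# tp = albuminuria correctly flagged, fn = albuminuria missed,
# fp = healthy patient flagged, tn = healthy patient correctly cleared.
tp, fp, fn, tn = 12, 5, 4, 16

acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)
print(f"accuracy = {acc:.2f}, F1-score = {f1:.2f}")
```

Because F1 ignores true negatives, it can diverge from accuracy when the classes are imbalanced, which is one reason both numbers are usually reported for a screening model.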

Paper Structure (9 sections, 4 equations, 4 figures, 3 tables)

This paper contains 9 sections, 4 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Dataset distribution
  • Figure 2: Simplified diabetic nephropathy mechanism
  • Figure 3: Distribution of the train-test dataset
  • Figure 4: Error analysis