Comparison of Machine Learning Classification Algorithms and Application to the Framingham Heart Study

Nabil Kahouadji

TL;DR

The double discriminant scoring of type I is introduced and shown to be the most generalizable, as it consistently outperforms the other classification algorithms regardless of the training/testing scenario.

Abstract

The use of machine learning algorithms in healthcare can amplify social injustices and health inequities. While the exacerbation of biases can occur and compound during problem selection, data collection, and outcome definition, this research pertains to some generalizability impediments that occur during the development and the post-deployment of machine learning classification algorithms. Using the Framingham coronary heart disease data as a case study, we show how to effectively select a probability cutoff to convert a regression model for a dichotomous variable into a classifier. We then compare the sampling distribution of the predictive performance of eight machine learning classification algorithms under four training/testing scenarios to test their generalizability and their potential to perpetuate biases. We show that both Extreme Gradient Boosting and Support Vector Machine are flawed when trained on an unbalanced dataset. We introduce the double discriminant scoring of type I and show that it is the most generalizable, as it consistently outperforms the other classification algorithms regardless of the training/testing scenario. Finally, we introduce a methodology to extract an optimal variable hierarchy for a classification algorithm, and illustrate it on the overall, male, and female Framingham coronary heart disease data.
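
To make the cutoff-selection step concrete, here is a minimal sketch of turning predicted probabilities for a dichotomous outcome into a classifier. The abstract does not state which selection rule the paper uses, so this sketch assumes one common criterion, maximizing Youden's J (true positive rate plus true negative rate minus one) over a grid of cutoffs. The data are synthetic and unbalanced, and the helper names (`rates_at_cutoff`, `select_cutoff`, `make_synthetic_scores`) are hypothetical; none of this is the Framingham data or the author's code.

```python
import math
import random


def rates_at_cutoff(probs, labels, cutoff):
    """Return (TPR, TNR) when predicting 1 if and only if prob >= cutoff."""
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= cutoff)
    tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < cutoff)
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, tn / neg


def select_cutoff(probs, labels, grid_size=101):
    """Pick the grid cutoff maximizing Youden's J (an assumed criterion)."""
    best_cutoff, best_j = 0.0, -1.0
    for i in range(grid_size):
        cutoff = i / (grid_size - 1)
        tpr, tnr = rates_at_cutoff(probs, labels, cutoff)
        j = tpr + tnr - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


def make_synthetic_scores(n=2000, prevalence=0.15, seed=0):
    """Hypothetical unbalanced data; scores stand in for fitted probabilities."""
    rng = random.Random(seed)
    probs, labels = [], []
    for _ in range(n):
        y = 1 if rng.random() < prevalence else 0
        # Positives get a higher latent score on average;
        # the logistic map keeps the resulting score in (0, 1).
        z = rng.gauss(-1.0 if y == 1 else -2.5, 1.0)
        probs.append(1.0 / (1.0 + math.exp(-z)))
        labels.append(y)
    return probs, labels


probs, labels = make_synthetic_scores()
cutoff, j = select_cutoff(probs, labels)
tpr_default, tnr_default = rates_at_cutoff(probs, labels, 0.5)
tpr_tuned, tnr_tuned = rates_at_cutoff(probs, labels, cutoff)
print(f"cutoff=0.50  TPR={tpr_default:.2f}  TNR={tnr_default:.2f}")
print(f"cutoff={cutoff:.2f}  TPR={tpr_tuned:.2f}  TNR={tnr_tuned:.2f}  J={j:.2f}")
```

On unbalanced data like this, the default cutoff of 0.5 tends to miss most positives, while a tuned cutoff trades some true negative rate for a much higher true positive rate. Repeating the selection over many random train/test splits would give a sampling distribution of cutoffs, in the spirit of what Figure 2 reports.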

Paper Structure (8 sections, 4 figures, 12 tables)

This paper contains 8 sections, 4 figures, 12 tables.

Figures (4)

  • Figure 1: One simulation for the Logistic and Random Forest Classifier Cutoffs for four training/testing scenarios
  • Figure 2: Sampling distribution of the Logistic and Random Forest Classifier Cutoffs for four training/testing scenarios
  • Figure 3: Mean of 100 True Positive Rates for eight classification algorithms as functions of the training ratio across four training/testing scenarios
  • Figure 4: Mean of 100 True Positive Rates for eight classification algorithms as functions of the training ratio across four training/testing scenarios for the Framingham CHD Male and Female Data