Pulmonologists-Level lung cancer detection based on standard blood test results and smoking status using an explainable machine learning approach

Ricco Noel Hansen Flyckt; Louise Sjodsholm; Margrethe Høstgaard Bang Henriksen; Claus Lohman Brasen; Ali Ebrahimi; Ole Hilberg; Torben Frøstrup Hansen; Uffe Kock Wiil; Lars Henrik Jensen; Abdolrahman Peimankar

TL;DR

The model identified smoking status, lactate dehydrogenase, age, total calcium levels, low values of sodium, leucocytes, neutrophil count, and C-reactive protein as the most important factors for LC detection.

Abstract

Lung cancer (LC) remains the primary cause of cancer-related mortality, largely due to late-stage diagnoses. Effective strategies for early detection are therefore of paramount importance. In recent years, machine learning (ML) has demonstrated considerable potential in healthcare by facilitating the detection of various diseases. In this retrospective development and validation study, we developed an ML model based on dynamic ensemble selection (DES) for LC detection. The model leverages standard blood sample analysis and smoking history data from a large at-risk population in Denmark. The study includes all patients examined on suspicion of LC in the Region of Southern Denmark from 2009 to 2018. We validated the DES model's predictions and compared them with diagnoses provided by five pulmonologists. Among the 38,944 patients, 9,940 had complete data, of which 2,505 (25\%) had LC. The DES model achieved an area under the ROC curve of 0.77$\pm$0.01, sensitivity of 76.2\%$\pm$2.4\%, specificity of 63.8\%$\pm$2.3\%, positive predictive value of 41.6\%$\pm$1.2\%, and F\textsubscript{1}-score of 53.8\%$\pm$1.1\%. The DES model outperformed all five pulmonologists, achieving a sensitivity 9\% higher than their average. The model identified smoking status, age, total calcium levels, neutrophil count, and lactate dehydrogenase as the most important factors for the detection of LC. The results highlight the successful application of the ML approach in detecting LC, surpassing pulmonologists' performance. Incorporating clinical and laboratory data in future risk assessment models can improve decision-making and facilitate timely referrals.
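The metrics reported in the abstract are linked by standard definitions, so they can be cross-checked against each other. The sketch below is illustrative only and is not taken from the paper's code: the function names are hypothetical, and the inputs are the rounded cross-validation means quoted above. Because those values are averages over folds, the PPV implied by the mean sensitivity, specificity, and prevalence is expected to agree only approximately with the reported PPV.

```python
def f1_from_ppv_and_sensitivity(ppv, sensitivity):
    """F1 is the harmonic mean of precision (PPV) and recall (sensitivity)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


def ppv_from_rates(sensitivity, specificity, prevalence):
    """PPV implied by sensitivity, specificity, and disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Rounded means quoted in the abstract.
sensitivity, specificity, ppv = 0.762, 0.638, 0.416
prevalence = 2505 / 9940  # LC cases among patients with complete data

f1 = f1_from_ppv_and_sensitivity(ppv, sensitivity)
implied_ppv = ppv_from_rates(sensitivity, specificity, prevalence)

print(f"F1 from reported PPV and sensitivity: {f1:.3f}")
print(f"PPV implied by sensitivity, specificity, prevalence: {implied_ppv:.3f}")
```

Under these assumptions, the recomputed F1 matches the reported 53.8\% after rounding, and the implied PPV falls close to the reported 41.6\%.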

Paper Structure (17 sections, 4 figures, 1 table)

This paper contains 17 sections, 4 figures, 1 table.

Introduction
Results
Discussion
Methods
Code and Data availability
Acknowledgements
Author contributions statement
Competing interests

Figures (4)

Figure 1: Flowchart illustrating LC detection from laboratory and smoking status data. (a) The composition of the study cohort. (b) The inclusion criteria for the data collection of patients suspected of having LC. (c) The workflow for splitting the data into train, validation, and test sets. The train and validation sets are used for the learning process of the model and to minimize the prediction/detection error. The test set of 200 samples is used to compare the model’s predictions with the diagnoses of five pulmonologists. (d) The data collected from different sources are concatenated, used as inputs for the DES model, and provided in the same form to the pulmonologists so that their diagnoses rest on the same information.
Figure 2: Comparison of evaluation metrics for the validation set using 5-fold cross-validation. (a) Model comparison using the sensitivity metric. There is a significant difference between the two highest-scoring models (i.e., LGBM and SVM) and LR. (b) Model comparison using the specificity metric. The only significant difference is between DES and LR. (c) Model comparison using the ROC-AUC metric. The only significant difference is between DES and SVM. (d) Model comparison using the F1-score metric. There is no significant difference between the models. The central marker represents the mean value, shown with the corresponding standard deviation. The horizontal brackets indicate significant differences in performance, as determined by the Nemenyi post-hoc test, with a two-sided p-value threshold of 0.05.
Figure 3: Assessment of the Dynamic Ensemble Selection (DES) model through 5-fold cross-validation. (a) Average confusion matrix for 5-fold cross-validation. (b) Average ROC curve for 5-fold cross-validation. The highlighted pink area around the ROC curve represents the standard deviation across the 5 folds. (c) Predicted probabilities compared to observed LC cases, showing the number of patients on the left y-axis and the fraction of patients on the right y-axis. Predicted probabilities are categorized into bins of 0.1. For instance, in the range of 0.7-0.8 (70-80%), the actual fraction of LC cases was 0.55 (55%), corresponding to 1000 patients with LC out of the total cases. (d) Decision curve analysis displaying the relationship between threshold probabilities and the net benefit when using the DES model to classify patients at high risk of LC. This is compared to selecting all patients (grey line) or no patients (blue line). The DES model demonstrates a higher net benefit than the other two clinical strategies across threshold probabilities ranging from approximately 7% to 35%. (e) SHAP summary plot with features listed in descending order of importance.
Figure 4: Assessment of the DES model on the 200 test samples and comparison with pulmonologists. (a) Confusion matrix of the DES model’s predictions versus the actual diagnoses. (b) Confusion matrix of the predictions made by the averaged pulmonologists’ votes versus the actual diagnoses. (c) ROC curve with each individual pulmonologist’s performance marked in red and the averaged performance marked by a green dot. (d) Correct predictions of the DES model and the averaged pulmonologists in relation to the four stages of lung cancer, alongside the actual distribution of each stage.
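
The decision curve in Figure 3(d) compares strategies by net benefit at each threshold probability. As a minimal sketch of how that quantity is conventionally computed, the code below uses the standard decision-curve definition, net benefit = TP/N - (FP/N) * p_t / (1 - p_t). The function names and all counts are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def net_benefit(true_pos, false_pos, n_patients, threshold):
    """Net benefit of a classifier at a given threshold probability."""
    odds = threshold / (1 - threshold)  # weight placed on a false positive
    return true_pos / n_patients - (false_pos / n_patients) * odds


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of referring every patient (the 'select all' strategy)."""
    odds = threshold / (1 - threshold)
    return prevalence - (1 - prevalence) * odds


# Hypothetical example: 1000 patients, 25% prevalence, threshold of 0.20.
threshold = 0.20
model_nb = net_benefit(true_pos=190, false_pos=270, n_patients=1000,
                       threshold=threshold)
treat_all_nb = net_benefit_treat_all(prevalence=0.25, threshold=threshold)
treat_none_nb = 0.0  # referring nobody yields no benefit and no harm

print(f"model: {model_nb:.4f}, treat all: {treat_all_nb:.4f}, "
      f"treat none: {treat_none_nb:.4f}")
```

A model is preferable at a given threshold when its net benefit exceeds both reference strategies, which is the comparison the caption describes for thresholds of roughly 7% to 35%.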
