Epanechnikov nonparametric kernel density estimation based feature-learning in respiratory disease chest X-ray images

Veronica Marsico, Antonio Quintero-Rincon, Hadj Batatia

TL;DR

This work investigates Epanechnikov nonparametric kernel density estimation (EKDE) as an interpretable feature extractor for chest X-ray classification of respiratory disease. It computes a 2-D feature vector $\phi=[\mu(\hat{f}(z)),\sigma(\hat{f}(z))]$ from EKDE fits and feeds it into a bimodal logistic regression classifier, evaluated on $13808$ X-ray images from the COVID-19 Radiography Dataset. The approach achieves about $70.14\%$ test accuracy with $59.26\%$ sensitivity and $74.18\%$ specificity, corresponding to an ROC AUC near $0.7$, indicating moderate diagnostic capability and room for improvement. The study highlights EKDE's potential for interpretable, low-complexity feature extraction in medical imaging and suggests avenues for enhancement, such as CNN integration, threshold optimization, and expanded datasets with clinical validation.
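To make the feature vector concrete, the following is a minimal sketch of how $\phi$ could be computed for a single image. It assumes that $\mu$ and $\sigma$ denote the mean and standard deviation of the estimated density values $\hat{f}(z)$ on an evaluation grid, which is one reading of the notation rather than a confirmed detail of the paper. The function names, the bandwidth of 0.1, the 64-point grid, and the toy pixel values are all hypothetical.

```python
import statistics


def epanechnikov(u):
    """Epanechnikov kernel: K(u) = 0.75 * (1 - u^2) for |u| <= 1, else 0."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def ekde(samples, grid, bandwidth):
    """Evaluate the kernel density estimate f_hat(z) at every z in `grid`."""
    n = len(samples)
    return [
        sum(epanechnikov((z - x) / bandwidth) for x in samples) / (n * bandwidth)
        for z in grid
    ]


def ekde_features(pixels, bandwidth=0.1, grid_size=64):
    """Return phi = [mean(f_hat(z)), std(f_hat(z))] for one image.

    `pixels` is a flat list of intensities scaled to [0, 1]. The bandwidth,
    the grid, and the use of the population standard deviation are
    illustrative assumptions, not values taken from the paper.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    density = ekde(pixels, grid, bandwidth)
    return [statistics.fmean(density), statistics.pstdev(density)]


# Hypothetical toy "image": a bimodal mix of dark and bright pixel intensities.
toy_pixels = [0.2] * 50 + [0.25] * 30 + [0.7] * 40 + [0.8] * 20
phi = ekde_features(toy_pixels)
print(phi)
```

Each image is reduced to two numbers, which is what keeps the downstream classifier low in complexity. In practice the bandwidth would need to be chosen with care, since it controls how much of the bimodal shape survives the smoothing.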

Abstract

This study presents a novel method for diagnosing respiratory diseases using image data. It combines Epanechnikov's non-parametric kernel density estimation (EKDE) with a bimodal logistic regression classifier in a statistical-model-based learning scheme. EKDE's flexibility in modeling data distributions without assuming specific shapes and its adaptability to pixel intensity variations make it valuable for extracting key features from medical images. The method was tested on 13808 randomly selected chest X-rays from the COVID-19 Radiography Dataset. It achieved an accuracy of 70.14%, a sensitivity of 59.26%, and a specificity of 74.18%, demonstrating moderate performance in detecting respiratory disease while showing room for improvement in sensitivity. While clinical expertise remains essential for further refining the model, this study highlights the potential of EKDE-based approaches to enhance diagnostic accuracy and reliability in medical imaging.
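The three reported metrics follow from the counts in a binary confusion matrix. The sketch below shows the standard definitions, assuming label 1 marks a disease case and label 0 a normal case. The function names and the ten labels are hypothetical and are not data from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count (tp, fn, tn, fp), with label 1 = disease and label 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp


def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical labels for ten images, used only to exercise the functions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
metrics = diagnostic_metrics(*confusion_counts(y_true, y_pred))
print(metrics)
```

Accuracy is the prevalence-weighted average of sensitivity and specificity. If the test split has more normal than disease images, the accuracy is pulled toward the specificity, which is consistent with 70.14% lying closer to 74.18% than to 59.26%.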

Paper Structure

This paper contains 8 sections, 9 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Examples of chest X-rays: (a) COVID-19, (b) normal.
  • Figure 2: Scatter plots of the EKDE parameters ($\mu,\sigma$) for all classes: (a) $10192$ normal CXR images, (b) $3616$ COVID-19 CXR images, and (c) both classes together. Note that the COVID-19 images have a larger spread. The goal is to determine whether this separation is enough to discriminate between classes.
  • Figure 3: EKDE fitting examples for the studied observations: (a-b) COVID-19 CXR images, with their EKDE in red, (c-d) normal CXR images, with their EKDE in blue. Note that the histograms are bimodal but have slightly different shapes, which we aim to capture using the classifier.
  • Figure 4: Confusion matrices of the proposed classifier in the training and testing stages. Class $1$ comprises the $3616$ COVID-19 cases, and class $0$ comprises the $10192$ normal cases.
  • Figure 5: ROC curves for the COVID-19 CXR images.
  • ...and 1 more figures
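
For readers less familiar with the ROC curves referenced in the figure list, the sketch below shows one common way to obtain ROC points and the AUC from classifier scores. Here the AUC is computed as the fraction of positive-negative pairs in which the positive case receives the higher score. The function names and the scores are hypothetical, and this is not necessarily how the paper computed its curves.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def roc_points(scores_pos, scores_neg):
    """Return (false positive rate, true positive rate) at each distinct threshold."""
    thresholds = sorted(set(scores_pos) | set(scores_neg), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        # A case is predicted positive when its score is at or above the threshold.
        tpr = sum(s >= t for s in scores_pos) / len(scores_pos)
        fpr = sum(s >= t for s in scores_neg) / len(scores_neg)
        points.append((fpr, tpr))
    return points


# Hypothetical classifier scores, higher meaning "more likely COVID-19".
covid_scores = [0.9, 0.7, 0.6, 0.4]
normal_scores = [0.8, 0.5, 0.3, 0.2, 0.1]
auc = roc_auc(covid_scores, normal_scores)
curve = roc_points(covid_scores, normal_scores)
print(auc, curve[-1])
```

Each point on the curve corresponds to one decision threshold, so threshold optimization amounts to picking the point whose trade-off between sensitivity and specificity suits the clinical setting.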