CoVScreen: Pitfalls and recommendations for screening COVID-19 using Chest X-rays

Sonit Singh

TL;DR

The paper addresses pitfalls in using chest X-rays for COVID-19 screening by highlighting data quality, imbalanced datasets, and evaluation biases. It introduces CoVScreen, a DenseNet121-based CNN trained on a large-scale, multi-source, curated CXR dataset with preprocessing and class-weighted training to mitigate bias, and analyzes performance across severity levels and pediatric-vs-adult differences using 5-fold cross-validation. Key findings show modest binary discrimination on a balanced 2-class dataset, with accuracy heavily influenced by disease severity and data source, revealing strong age- and modality-related biases that challenge generalization. The work underscores the need for rigorous dataset curation, severity-aware evaluation, and explainability in AI-assisted radiology, offering a path toward more reliable and clinically interpretable screening tools.
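
The summary above mentions class-weighted training and 5-fold cross-validation without giving their exact form. As a rough illustration only, the sketch below computes inverse-frequency class weights (the common "balanced" heuristic) and assigns samples to stratified folds. The function names, the toy label counts, and the choice of weighting formula are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter
import random


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign sample indices to folds so class ratios stay similar per fold."""
    rng = random.Random(seed)
    by_class = {}
    for idx, cls in enumerate(labels):
        by_class.setdefault(cls, []).append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across folds.
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


# Hypothetical toy labels (assumed for illustration, not the paper's data).
labels = ["non-covid"] * 80 + ["covid"] * 20
weights = inverse_frequency_weights(labels)
folds = stratified_folds(labels, n_folds=5)
```

In each round of cross-validation, one fold would be held out for evaluation and the remaining four used for training, with `weights` scaling the per-class loss so that the minority class is not ignored.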

Abstract

The novel coronavirus disease (COVID-19), a highly infectious respiratory disease caused by SARS-CoV-2, has emerged as an unprecedented healthcare crisis. The pandemic has had a devastating impact on the health, well-being, and economy of the global population. Early screening and diagnosis of symptomatic patients play a crucial role in isolating patients to help stop community transmission and in providing early treatment, which helps reduce the mortality rate. Although the RT-PCR test is the gold standard for COVID-19 testing, it is a manual, laborious, time-consuming, uncomfortable, and invasive process. Owing to its accessibility, availability, lower cost, ease of sanitisation, and portable setup, chest X-ray imaging can serve as an effective screening and diagnostic tool. In this study, we first highlight limitations of existing datasets and studies in terms of data quality, data imbalance, and evaluation strategy. Second, we curated a large-scale COVID-19 chest X-ray dataset from many publicly available COVID-19 imaging databases and proposed a pre-processing pipeline to improve the quality of the dataset. We proposed CoVScreen, a CNN architecture, which we trained and tested on the curated dataset. Experimental results across different classification scenarios and evaluation metrics on the curated dataset demonstrate the effectiveness of the proposed methodology for screening COVID-19 infection.

Paper Structure (20 sections, 7 figures, 10 tables)


Figures (7)

  • Figure 1: Diversity and irregularities in CXRs of COVID-19 dataset.
  • Figure 2: Results of data pre-processing pipeline.
  • Figure 3: Results of the marker-removal pipeline.
  • Figure 4: Differences between chest X-rays from paediatric patients and COVID-19 chest X-rays from the adult population.
  • Figure 5: Differences in rib positioning visible in chest X-rays from paediatric patients vs. chest X-rays from the adult population. Source: https://www.rch.org.au/trauma-service/manual/how-are-children-different/
  • ...and 2 more figures