Zero-Shot Pediatric Tuberculosis Detection in Chest X-Rays using Self-Supervised Learning

Daniel Capellán-Martín, Abhijeet Parida, Juan J. Gómez-Valverde, Ramon Sanchez-Jacob, Pooneh Roshanitabrizi, Marius G. Linguraru, María J. Ledesma-Carbayo, Syed M. Anwar

TL;DR

This work tackles pediatric TB detection from chest X-rays (CXRs) by leveraging self-supervised learning on large adult datasets. A Vision Transformer (ViT) backbone is pre-trained with MAE, MoCo-v3, and DINO on 357k+ CXRs, then fine-tuned for adult TB detection and evaluated zero-shot on an unseen pediatric cohort. The approach reaches $0.959$ AUC and $0.962$ AUPR for adult TB, and up to $0.697$ AUC and $0.607$ AUPR for zero-shot pediatric TB, with top AUC/AUPR gains over fully supervised (non-pre-trained) baselines of about $12.7\%$ and $13.4\%$ in adults and children, respectively. This indicates that self-supervised pre-training on adult CXRs can yield transferable representations that improve pediatric TB screening where data are scarce, highlighting the potential of foundation models in clinical radiology.

Abstract

Tuberculosis (TB) remains a significant global health challenge, with pediatric cases posing a major concern. The World Health Organization (WHO) advocates for chest X-rays (CXRs) for TB screening. However, visual interpretation by radiologists can be subjective, time-consuming, and prone to error, especially in pediatric TB. Artificial intelligence (AI)-driven computer-aided detection (CAD) tools, especially those utilizing deep learning, show promise in enhancing lung disease detection. However, challenges include data scarcity and lack of generalizability. In this context, we propose a novel self-supervised paradigm leveraging Vision Transformers (ViT) for improved TB detection in CXR, enabling zero-shot pediatric TB detection. We demonstrate improvements in TB detection performance ($\sim$12.7% and $\sim$13.4% top AUC/AUPR gains in adults and children, respectively) with self-supervised pre-training compared to fully supervised (i.e., non-pre-trained) ViT models, achieving top performances of 0.959 AUC and 0.962 AUPR in adult TB detection, and 0.697 AUC and 0.607 AUPR in zero-shot pediatric TB detection. This work thus demonstrates that self-supervised learning on adult CXRs effectively extends to challenging downstream tasks such as pediatric TB detection, where data are scarce.
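
The two metrics reported above, AUC and AUPR, can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the toy labels, and the toy scores are hypothetical, and AUPR is estimated here with the step-wise average-precision sum, which is one common estimator and may differ from the one used in the paper.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random TB-positive image scores
    higher than a random TB-negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUPR estimated as the mean of precision values taken at the rank
    of each positive sample. Assumes scores are not tied."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank samples from most to least suspicious.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data: 1 = TB, 0 = no TB; scores are model probabilities.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_aupr = average_precision(toy_labels, toy_scores)
```

In a zero-shot setting, the same functions would be applied to scores produced on the pediatric images by a model that was fine-tuned only on adult data, with no pediatric images seen during training.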

Paper Structure

This paper contains 12 sections, 1 equation, 2 figures, 2 tables.

Figures (2)

  • Figure 1: We propose a novel self-supervised strategy leveraging ViTs for improved TB detection in CXR, enabling zero-shot pediatric TB detection.
  • Figure 2: Receiver operating characteristic (ROC) curves when conducting fine-tuning on adult TB data (MC & SZ training subsets) and: (a) testing on an independent adult TB dataset; (b) performing zero-shot inference on an independent, out-of-domain pediatric TB dataset.
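
As a rough illustration of how ROC curves such as those in Figure 2 are traced, the sketch below sweeps a decision threshold over the model scores and records one (false positive rate, true positive rate) point per distinct score. The function name and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int],
               scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) pairs from the strictest threshold (0, 0)
    to the most permissive one (1, 1)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    # Lowering the threshold admits samples in order of decreasing score.
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    points = [(0.0, 0.0)]
    true_pos = 0
    false_pos = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            true_pos += 1
        else:
            false_pos += 1
        # Emit a point only after the last sample sharing this score,
        # so tied scores move the curve in a single step.
        is_last_of_score = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if is_last_of_score:
            points.append((false_pos / n_neg, true_pos / n_pos))
    return points


# Hypothetical toy data: 1 = TB, 0 = no TB.
curve = roc_points([1, 0, 1, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.3, 0.2])
```

Plotting TPR against FPR for these points gives the ROC curve. A curve that hugs the top-left corner corresponds to a higher AUC, while the diagonal corresponds to chance-level detection.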