Recognizing Pneumonia in Real-World Chest X-rays with a Classifier Trained with Images Synthetically Generated by Nano Banana

Jiachuan Peng, Kyle Lam, Jianing Qiu

TL;DR

The paper investigates whether a pneumonia classifier trained only on synthetic chest X-rays (CXRs) generated by Nano Banana generalizes to real-world datasets. A ResNet-50 model fine-tuned on cropped synthetic images achieves strong external validation metrics: AUROC 0.923 and AUPR 0.900 on the 2018 RSNA Pneumonia Detection dataset, and AUROC 0.824 and AUPR 0.913 on the Chest X-Ray dataset, demonstrating the feasibility of synthetic data for medical AI development. The study highlights the importance of post-generation processing (cropping to remove watermarks) and shows that models trained on synthetic data focus on clinically relevant lung regions, with additional analyses supporting discriminative feature learning. Nevertheless, limitations in prompt control and generalization to other domains, together with regulatory and ethical considerations, indicate that substantial validation and oversight are required before clinical deployment.

Abstract

We trained a classifier with synthetic chest X-ray (CXR) images generated by Nano Banana, the latest AI model for image generation and editing released by Google. Trained only on synthetic data and applied directly to real-world CXRs, the classifier achieved an AUROC of 0.923 (95% CI: 0.919 - 0.927) and an AUPR of 0.900 (95% CI: 0.894 - 0.907) in recognizing pneumonia in the 2018 RSNA Pneumonia Detection dataset (14,863 CXRs), and an AUROC of 0.824 (95% CI: 0.810 - 0.836) and an AUPR of 0.913 (95% CI: 0.904 - 0.922) in the Chest X-Ray dataset (5,856 CXRs). These external validation results on real-world data demonstrate the feasibility of this approach and suggest potential for synthetic data in medical AI development. Nonetheless, several limitations remain at present, including challenges in prompt design for controlling the diversity of synthetic CXR data and the requirement for post-processing to ensure alignment with real-world data. Moreover, given the growing sophistication and accessibility of medical intelligence, substantial validation, regulatory approval, and ethical oversight will be necessary prior to clinical translation.
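The metrics quoted above can be illustrated with a minimal sketch. The abstract does not say how the 95% confidence intervals were obtained; the percentile bootstrap used here is an assumption, and the function names (`auroc`, `average_precision`, `bootstrap_ci`) are hypothetical helpers, not the authors' code. AUPR is approximated by average precision, which is one common estimator.

```python
import random


def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) identity; tied scores get average ranks."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_precision(labels, scores):
    """Average precision: sum of recall increments times precision at each threshold."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive is required")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = prev_tp = 0
    ap = 0.0
    i = 0
    while i < len(order):
        j = i
        # Consume every sample that shares the current threshold.
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            if labels[order[j]] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        ap += (tp - prev_tp) / n_pos * tp / (tp + fp)
        prev_tp = tp
        i = j
    return ap


def bootstrap_ci(metric, labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed method, not confirmed by the paper)."""
    n = len(labels)
    if not 0 < sum(labels) < n:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost one class
            stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Toy data (illustrative only): 1 = pneumonia, 0 = normal.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(auroc(y_true, y_score))              # 0.75
print(average_precision(y_true, y_score))  # about 0.833
print(bootstrap_ci(auroc, y_true, y_score))
```

On datasets the size of those reported here, a vectorized library implementation would be far faster than this pure-Python version; the sketch only shows what the numbers mean.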

Paper Structure

This paper contains 5 sections, 4 figures.

Figures (4)

  • Figure 1: Examples of text-conditional generation of chest radiographs. The top row (1-5) shows synthesized samples of pneumonia, while the bottom row (6-10) depicts healthy subjects. The generated images exhibit diversity across multiple factors including sex, anatomical structures, and medical artifacts. After generation, all images were cropped to remove watermarks and irrelevant regions.
  • Figure 2: Performance comparison between the models fine-tuned on raw CXRs generated by Nano Banana (light blue), on cropped Nano Banana CXRs (dark blue), and on RoentGen-v2 generated CXRs (yellow) based on AUROC (A, C) and AUPR (B, D). We conducted evaluation separately on the Chest X-Ray dataset (n=5,856) and the 2018 RSNA Pneumonia Detection Challenge dataset (n=14,863). The baselines in AUPR graphs indicate the proportion of pneumonia cases in the dataset.
  • Figure 3: Uniform Manifold Approximation and Projection (UMAP) visualization of features extracted from the classifier fine-tuned on cropped Nano Banana CXRs, on raw synthetic CXRs generated by Nano Banana, on RoentGen-v2 generated synthetic CXRs, and pre-trained on ImageNet (no fine-tuning). We conducted evaluation separately on the Chest X-Ray dataset (n=5,856) and the 2018 RSNA Pneumonia Detection Challenge dataset (n=14,863). For each feature set, we performed K-means clustering and report the resulting classification Accuracy (Acc) and adjusted Rand index (ARI).
  • Figure 4: Grad-CAM visualization of the feature map of the last convolutional layer before the classification head. A: original image; B: fine-tuned on cropped CXRs generated by Nano Banana; C: on raw Nano Banana-generated CXRs; and D: on CXRs generated by RoentGen-v2.
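
The clustering scores named in the Figure 3 caption (Acc and ARI) can be sketched as follows. This is an illustration under stated assumptions, not the authors' evaluation code: cluster assignments are assumed to come from any K-means implementation run on the extracted features, the number of clusters is assumed not to exceed the number of classes, and `clustering_accuracy` and `adjusted_rand_index` are hypothetical helper names.

```python
from collections import Counter
from itertools import permutations
from math import comb


def clustering_accuracy(labels, clusters):
    """Best accuracy over all one-to-one mappings from cluster ids to class labels.

    Assumes the number of distinct clusters does not exceed the number of classes.
    """
    classes = sorted(set(labels))
    cluster_ids = sorted(set(clusters))
    best = 0
    for perm in permutations(classes, len(cluster_ids)):
        mapping = dict(zip(cluster_ids, perm))
        correct = sum(mapping[c] == y for y, c in zip(labels, clusters))
        best = max(best, correct)
    return best / len(labels)


def adjusted_rand_index(labels, clusters):
    """Adjusted Rand index computed from the label/cluster contingency table."""
    n = len(labels)
    if n < 2:
        raise ValueError("need at least two samples")
    sum_cells = sum(comb(v, 2) for v in Counter(zip(labels, clusters)).values())
    sum_rows = sum(comb(v, 2) for v in Counter(labels).values())
    sum_cols = sum(comb(v, 2) for v in Counter(clusters).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. a single class and a single cluster
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy example (illustrative only): cluster ids are arbitrary, so a swapped
# labeling of a perfect partition still scores 1.0 on both measures.
y_true = [0, 0, 1, 1]
print(clustering_accuracy(y_true, [1, 1, 0, 0]))  # 1.0
print(adjusted_rand_index(y_true, [1, 1, 0, 0]))  # 1.0
print(adjusted_rand_index(y_true, [0, 1, 0, 1]))  # -0.5
```

Accuracy depends on finding the best cluster-to-class mapping, while ARI is invariant to how clusters are named and corrects for chance agreement, which is why the two are reported together.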