What Does DALL-E 2 Know About Radiology?

Lisa C. Adams, Felix Busch, Daniel Truhn, Marcus R. Makowski, Hugo JWL. Aerts, Keno K. Bressem

TL;DR

Problem: Radiology datasets are often sparse and heterogeneous, which limits AI development. Approach: The study probes DALL-E 2's radiology knowledge through text-to-image generation, inpainting of erased regions in radiographs, and extension of images beyond their original borders, across X-ray, CT/MRI, and ultrasound. Key findings: DALL-E 2 renders X-ray anatomy plausibly and can inpaint and extend images, but its outputs for cross-sectional modalities and ultrasound are weak, and pathology generation is constrained by training data and safety filters. Significance: The results suggest that diffusion-based, domain-adapted models could augment radiological data for research, but they require targeted fine-tuning and careful handling of pathologies and privacy.

Abstract

Generative models such as DALL-E 2 could represent a promising future tool for image generation, augmentation, and manipulation for artificial intelligence research in radiology provided that these models have sufficient medical domain knowledge. Here we show that DALL-E 2 has learned relevant representations of X-ray images with promising capabilities in terms of zero-shot text-to-image generation of new images, continuation of an image beyond its original boundaries, or removal of elements, while pathology generation or CT, MRI, and ultrasound images are still limited. The use of generative models for augmenting and generating radiological data thus seems feasible, even if further fine-tuning and adaptation of these models to the respective domain is required beforehand.

Paper Structure (7 sections, 4 figures)

This paper contains 7 sections and 4 figures.

Figures (4)

  • Figure 1: Sample generated anatomical structures in X-ray from short text descriptions using DALL-E 2.
  • Figure 2: Sample text-to-image generated anatomical structures in CT, MRI, and ultrasound using DALL-E 2.
  • Figure 3: Reconstructed areas of different anatomical locations in X-rays using DALL-E 2. The yellow-bordered regions of the original images were erased before providing the remnant image for reconstruction.
  • Figure 4: Extending X-ray images of different anatomical regions beyond their borders using DALL-E 2. The original X-ray is shown in the yellow boxes, with the remainder of the images being generated by DALL-E 2.