Multi-Conditioned Denoising Diffusion Probabilistic Model (mDDPM) for Medical Image Synthesis

Arjun Krishna, Ge Wang, Klaus Mueller

TL;DR

This work trains a large-scale Denoising Diffusion Probabilistic Model (DDPM) in the lung CT domain and expands upon a classifier-free sampling strategy to demonstrate a controlled generation framework for annotated synthetic images, guided by multiple conditional inputs.

Abstract

Medical imaging applications are highly specialized in terms of human anatomy, pathology, and imaging domains. Annotated datasets for training deep learning applications in medical imaging must therefore be not only highly accurate but also diverse and large enough to cover almost all plausible examples with respect to those specifications. We argue that a controlled generation framework for synthetic images with annotations, which takes multiple conditional specifications as input to provide control, can help achieve this goal. We employ a Denoising Diffusion Probabilistic Model (DDPM) to train a large-scale generative model in the lung CT domain and expand upon a classifier-free sampling strategy to showcase one such generation framework. We show that our approach can produce annotated lung CT images that faithfully represent anatomy and convincingly fool experts into perceiving them as real. Our experiments demonstrate that controlled generative frameworks of this nature can surpass nearly every state-of-the-art image generative model in achieving anatomical consistency in generated medical images when trained on comparably large medical datasets.

Paper Structure

This paper contains 8 sections, 10 equations, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: Multi-Conditioned Guided Sampling. The blue area represents the image space of all CT lung images; the yellow, green and red circles represent the regions of that space close to the three guidance images y1, y2 and y3. The size of each circle depends on the guidance image itself and on the downsampling factor (n1, n2 or n3) of the filter applied to that image.
  • Figure 2: Six examples of 2D lung CT images, shown in the soft-tissue window, generated with two conditional images. The left and right sections each display three generated images for three different anatomy / segmentation maps and the same reference (conditional) CT image, shown in the red boxes. The generations follow the anatomy of the segmentation maps above them, while the generated heart slice corresponds to that of the reference CT image. The results are displayed in the soft-tissue window to highlight the similarity and accuracy of the generated anatomy with respect to the guidance images.
  • Figure 3: Left-most column (outlined with a red box): images generated with our multi-conditional sampling algorithm, shown at full HU range. Other three columns: these images in their respective bone, soft-tissue, and lung windows.
  • Figure 4: Confusion matrices for the responses of the 3 radiologists. The overall accuracy of the responses is 45.56%, which is close to 50%, the requirement for passing our Visual Turing Test. The proportion of 'True Negatives' (fake images identified as fake) is 6.67%, whereas the proportion of 'False Negatives' (real images identified as fake) is 15.56%.
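
The bone, soft-tissue, and lung windows mentioned in Figure 3 are display ranges applied to the same Hounsfield unit (HU) data. The following is a minimal sketch of such windowing, not the authors' code. The (level, width) presets, the `apply_window` helper, and the sample values are illustrative assumptions based on common radiology conventions; the paper's actual window settings are not given here.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed (level, width) presets in Hounsfield units. These are common
# display conventions, NOT values taken from the paper.
WINDOWS: Dict[str, Tuple[float, float]] = {
    "bone": (400.0, 1800.0),
    "soft_tissue": (40.0, 400.0),
    "lung": (-600.0, 1500.0),
}


def apply_window(hu_values: Sequence[float], level: float, width: float) -> List[float]:
    """Clip HU values to [level - width/2, level + width/2], then rescale to [0, 1]."""
    low = level - width / 2.0
    high = level + width / 2.0
    return [(min(max(v, low), high) - low) / (high - low) for v in hu_values]


# Hypothetical sample intensities, roughly: air, lung tissue, water,
# soft tissue, dense bone.
sample_hu = [-1000.0, -700.0, 0.0, 60.0, 1000.0]

# One displayable view per window, all derived from the same HU data.
views = {
    name: apply_window(sample_hu, level, width)
    for name, (level, width) in WINDOWS.items()
}

for name, values in views.items():
    print(name, [round(v, 3) for v in values])
```

A narrow window such as the assumed soft-tissue preset saturates air and bone to the ends of the display range, which is why the same generated image can look very different across the three windowed columns.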