Deep Generative Models for 3D Medical Image Synthesis

Paul Friedrich, Yannik Frisch, Philippe C. Cattin

TL;DR

This chapter explores various deep generative models for 3D medical image synthesis, with a focus on Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and Denoising Diffusion Models (DDMs).

Abstract

Deep generative modeling has emerged as a powerful tool for synthesizing realistic medical images, driving advances in medical image analysis, disease diagnosis, and treatment planning. This chapter explores various deep generative models for 3D medical image synthesis, with a focus on Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and Denoising Diffusion Models (DDMs). We discuss the fundamental principles, recent advances, and strengths and weaknesses of these models, and examine their applications to clinically relevant problems, including unconditional generation and conditional generation tasks such as image-to-image translation and image reconstruction. We additionally review commonly used evaluation metrics for assessing image fidelity, diversity, utility, and privacy, and provide an overview of current challenges in the field.

Paper Structure

This paper contains 20 sections, 11 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Differences between natural and medical images. Natural 2D images [deng2009imagenet, khosla2011dogs, yu2015lsun] are widely available in large-scale datasets, as they can easily be scraped from the internet. In contrast, 3D medical data [bakas2017advancing, bakas2018identifying, chilamkurthy2018development, armato2011lung, menze2014multimodal] is scarce due to the high cost of acquisition, as well as ethical and privacy concerns.
  • Figure 2: The basic principle of generative modeling. Using data from the data distribution $p_{data}$, we try to find a model $p_{model}$ that closely follows this distribution. We can then use this model to generate new samples that resemble the original data distribution.
  • Figure 3: The basic principle of Variational Autoencoders. An input image $x$ is encoded into a KL-regularized latent representation $z = E(x)$ and is subsequently reconstructed as $x'=D(z)$. By minimizing the reconstruction error together with the KL divergence between the latent distribution and a standard normal distribution, the model learns both to encode data in a meaningful way and to generate new data.
  • Figure 4: The basic principle of Generative Adversarial Networks. The generator $G$ and the discriminator $D$ play an adversarial game against each other, where the generator tries to synthesize realistic images that the discriminator cannot distinguish from the real training data.
  • Figure 5: The basic principle of Denoising Diffusion Models. The diffusion model consists of two main components: a fixed diffusion process that gradually perturbs input data with Gaussian noise and maps the data distribution to a simple prior, and a learned reverse process with each transition being a Gaussian parameterized by a time-conditioned neural network.
  • ...and 2 more figures
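
To make the objective described in the Figure 3 caption concrete, the following is a minimal NumPy sketch of a VAE loss: a reconstruction error plus the KL divergence between the latent distribution and a standard normal. It rests on assumptions that are not stated on this page: the encoder is taken to output a mean `mu` and log-variance `log_var` of a diagonal Gaussian, the reconstruction error is taken to be a summed squared error, and the function names and the weight `beta` are hypothetical. It is an illustration, not the chapter's implementation.

```python
import numpy as np


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ) per sample.

    Assumption: the latent distribution is a diagonal Gaussian.
    """
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - 1.0 - log_var, axis=-1)


def vae_loss(x, x_rec, mu, log_var, beta=1.0):
    """Reconstruction error plus beta-weighted KL term, averaged over the batch.

    x, x_rec : arrays of shape (batch, depth, height, width)
    mu, log_var : arrays of shape (batch, latent_dim)
    beta : hypothetical weighting of the KL term (1.0 gives the plain sum)
    """
    # Squared error summed over all voxels of each volume (assumed choice).
    voxel_axes = tuple(range(1, x.ndim))
    rec = np.sum((x - x_rec) ** 2, axis=voxel_axes)
    kl = kl_to_standard_normal(mu, log_var)
    return float(np.mean(rec + beta * kl))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(2, 8, 8, 8))                 # toy batch of two 8x8x8 volumes
    x_rec = x + 0.1 * rng.normal(size=x.shape)        # stand-in for D(z)
    mu = rng.normal(size=(2, 16))                     # stand-in encoder mean
    log_var = 0.1 * rng.normal(size=(2, 16))          # stand-in encoder log-variance
    print("VAE loss:", vae_loss(x, x_rec, mu, log_var))
```

The KL term vanishes exactly when `mu` is zero and `log_var` is zero, that is, when the latent distribution already equals the standard normal prior, which is what allows new samples to be drawn from that prior at generation time.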