Diffusion Models for Medical Image Analysis: A Comprehensive Survey

Amirhossein Kazerouni, Ehsan Khodapanah Aghdam, Moein Heidari, Reza Azad, Mohsen Fayyaz, Ilker Hacihaliloglu, Dorit Merhof

TL;DR

Diffusion models are rapidly shaping medical image analysis by providing high-quality, diverse samples without heavy reliance on paired data. The survey maps three core formulations—DDPMs, NCSNs, and SDEs—and offers a multi-faceted taxonomy of applications including translation, reconstruction, segmentation, denoising, generation, and anomaly detection. It presents representative methods, practical considerations, and open issues such as speed, latent representations, and privacy, while highlighting successful cross-modality and multi-modal strategies. Collectively, the work serves as a comprehensive guide for researchers and clinicians to leverage diffusion models in diverse medical imaging tasks and to pursue future directions in theory, architecture, and clinical deployment.

Abstract

Denoising diffusion models, a class of generative models, have garnered immense interest lately in various deep-learning problems. A diffusion probabilistic model defines a forward diffusion stage where the input data is gradually perturbed over several steps by adding Gaussian noise and then learns to reverse the diffusion process to retrieve the desired noise-free data from noisy data samples. Diffusion models are widely appreciated for their strong mode coverage and quality of the generated samples despite their known computational burdens. Capitalizing on the advances in computer vision, the field of medical imaging has also observed a growing interest in diffusion models. To help the researcher navigate this profusion, this survey intends to provide a comprehensive overview of diffusion models in the discipline of medical image analysis. Specifically, we introduce the solid theoretical foundation and fundamental concepts behind diffusion models and the three generic diffusion modelling frameworks: diffusion probabilistic models, noise-conditioned score networks, and stochastic differential equations. Then, we provide a systematic taxonomy of diffusion models in the medical domain and propose a multi-perspective categorization based on their application, imaging modality, organ of interest, and algorithms. To this end, we cover extensive applications of diffusion models in the medical domain. Furthermore, we emphasize the practical use case of some selected approaches, and then we discuss the limitations of the diffusion models in the medical domain and propose several directions to fulfill the demands of this field. Finally, we gather the overviewed studies with their available open-source implementations at https://github.com/amirhossein-kz/Awesome-Diffusion-Models-in-Medical-Imaging.
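The forward diffusion stage described in the abstract can be illustrated with a minimal NumPy sketch. It assumes the standard DDPM closed form, in which a noisy sample at step t is a weighted mix of the clean input and Gaussian noise. The linear noise schedule, its endpoints, the 1000-step horizon, the 8x8 stand-in image, and the function names `linear_beta_schedule` and `forward_diffuse` are illustrative assumptions, not details taken from the survey.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances; the linear form and endpoints are assumed values."""
    return np.linspace(beta_start, beta_end, num_steps)

def forward_diffuse(x0, t, alpha_bar, rng):
    """Draw x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return x_t, noise

rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)        # fraction of signal variance kept at each step

x0 = rng.uniform(-1.0, 1.0, size=(8, 8))   # hypothetical stand-in for a normalized image
x_early, eps_early = forward_diffuse(x0, 10, alpha_bar, rng)   # still close to x0
x_late, eps_late = forward_diffuse(x0, 999, alpha_bar, rng)    # almost pure noise
```

Because `alpha_bar` decays toward zero, the last step retains almost none of the original signal. This is why sampling can start from pure Gaussian noise once a model has learned to reverse the process.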

Paper Structure (22 sections, 13 equations, 17 figures, 1 table)

This paper contains 22 sections, 13 equations, 17 figures, 1 table.

Figures (17)

  • Figure 1: Panels (a) and (b) show the relative proportion of published papers categorized by application and by imaging modality, respectively. Panel (c) shows the number of diffusion-based research papers published in the medical domain. The growth rate per year reveals the importance of diffusion models for future work. The overall number of papers is 103.
  • Figure 2: Overview of different generative models and their underlying principles. A Generative Adversarial Network (GAN) goodfellow2020generative is an end-to-end pipeline that trains the generator in an adversarial manner to generate samples that the discriminator cannot distinguish from real data samples. An Energy-based Model (EBM) lecun2006tutorial, also known as a non-normalized probabilistic model, trains in the same way as a GAN with two major modifications. First, the discriminator learns a proper energy-based function that maps the data sample to a distribution space. Second, the generator utilizes a prior input to enhance the sample generation performance. A Variational AutoEncoder (VAE) kingma2013auto is a standalone network whose encoder projects a data sample to a low-dimensional latent space and whose decoder generates samples from points drawn in that space. A Normalizing Flow (NF) papamakarios2021normalizing uses an invertible flow function to transform the input to a latent space and generates samples with the inverse flow function. Diffusion models blend noise into the input over successive steps until it becomes a noise distribution, then apply a reverse process that removes the added noise step by step during sampling.
  • Figure 3: Generative learning trilemma xiao2022tackling. GANs generate high-fidelity samples quickly, but their mode coverage is limited. VAEs and normalizing flows produce diverse samples, but their sample quality is generally poor. Diffusion models have emerged to compensate for the deficiencies of VAEs and GANs by offering adequate mode coverage and high-quality samples. Nevertheless, their iterative nature makes sampling slow and computationally expensive, so further improvement is needed.
  • Figure 4: Ten synthetic histopathology images generated by MFDPM moghadam2023morphology.
  • Figure 5: The proposed taxonomy for diffusion-based medical imaging research is built on nine sub-fields: 1) Image-to-Image Translation, 2) Image Reconstruction, 3) Image Registration, 4) Image Classification, 5) Image Segmentation, 6) Image Denoising, 7) Image Generation, 8) Anomaly Detection, and 9) Multi-disciplinary applications, named Other Applications. For the sake of brevity, we utilize the prefix numbers in the paper's name in ascending order and denote the reference for each study as follows: 1. lyu2022conversion, 2. meng2022novel, 3. ozbey2022unsupervised, 4. song2022solving, 5. xie2022measurement, 6. chung2022score, 7. peng2022towards, 8. dar2022adaptive, 9. luo2022mri, 10. cui2022self, 11. chung2022improving, 12. kim2022diffusemorph, 13. yang2023diffmic, 14. fernandez2022can, 15. kim2022vessel, 16. wolleb2022diffusion, 17. gong2022pet, 18. hu2022unsupervised, 19. pinaya2022brain, 20. moghadam2023morphology, 21. waibel2022diffusion, 22. dorjsembe2022three, 23. kim2022diffusion, 24. wyatt2022anoddpm, 25. sanchez2022healthy, 26. wolleb2022swiss, 27. wolleb2022diffusionanomaly, 28. pinaya2022fast, 29. wang2022fight, 30. trippe2023diffusion, 31. chung2022mr.
  • ...and 12 more figures
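
The reverse process mentioned in the Figure 2 caption, and the slow iterative sampling noted in the Figure 3 caption, can be sketched as the loop below. It assumes DDPM-style ancestral sampling with a linear noise schedule. A closed-form noise predictor for one-dimensional data drawn from N(mu, 1) stands in for a trained network. The schedule, the toy data distribution, and the function names `optimal_eps` and `ancestral_sample` are illustrative assumptions, not methods taken from the survey.

```python
import numpy as np

def optimal_eps(x_t, t, alpha_bar, mu):
    """Closed-form E[eps | x_t] when x_0 ~ N(mu, 1); stands in for a trained network."""
    return np.sqrt(1.0 - alpha_bar[t]) * (x_t - np.sqrt(alpha_bar[t]) * mu)

def ancestral_sample(num_samples, betas, mu, rng):
    """Run the reverse chain from pure noise down to step 0, one step at a time."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    x = rng.standard_normal(num_samples)          # start from Gaussian noise
    calls = 0
    for t in range(len(betas) - 1, -1, -1):
        eps_hat = optimal_eps(x, t, alpha_bar, mu)
        calls += 1                                # one predictor evaluation per step
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(num_samples) if t > 0 else 0.0  # no noise at the last step
        x = mean + np.sqrt(betas[t]) * noise
    return x, calls

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)             # assumed linear schedule
samples, calls = ancestral_sample(20000, betas, mu=3.0, rng=rng)
print(samples.mean(), samples.std(), calls)
```

Each step depends on the output of the previous one, so the predictor is evaluated once per step and the steps cannot run in parallel. With a large neural network in place of `optimal_eps`, that per-step cost is what makes sampling expensive.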