Deep Learning Approaches for Data Augmentation in Medical Imaging: A Review

Aghiles Kebaili, Jérôme Lapuyade-Lahorgue, Su Ruan

TL;DR

This review addresses data scarcity in medical imaging by evaluating deep generative models for data augmentation. It compares variational autoencoders, generative adversarial networks, and diffusion models across medical tasks such as classification, segmentation, and cross-modal translation, highlighting their respective strengths and limitations. The analysis notes a rising prominence of diffusion models since 2022, while GANs and VAEs remain relevant, particularly in hybrid configurations. The paper emphasizes the potential of hybrid architectures, domain-specific knowledge integration, and efficiency improvements to enhance the realism and usefulness of synthetic medical images in practice.

Abstract

Deep learning has become a popular tool for medical image analysis, but the limited availability of training data remains a major challenge, particularly in the medical field, where data acquisition can be costly and subject to privacy regulations. Data augmentation techniques offer a solution by artificially increasing the number of training samples, but these techniques often produce limited and unconvincing results. To address this issue, a growing number of studies have proposed using deep generative models to generate more realistic and diverse data that conform to the true data distribution. In this review, we focus on three types of deep generative models for medical image augmentation: variational autoencoders, generative adversarial networks, and diffusion models. We provide an overview of the current state of the art in each of these models and discuss their potential for use in different downstream tasks in medical imaging, including classification, segmentation, and cross-modal translation. We also evaluate the strengths and limitations of each model and suggest directions for future research in this field. Our goal is to provide a comprehensive review of the use of deep generative models for medical image augmentation and to highlight the potential of these models for improving the performance of deep learning algorithms in medical image analysis.

Paper Structure (15 sections, 8 equations, 6 figures, 6 tables)

Figures (6)

  • Figure S1: Distribution of publications on deep generative models applied to medical imaging data augmentation as of 2022. (a) The number of publications per architecture type and year. (b) The distribution of publications by modality, with CT and MRI being the most commonly studied imaging modalities. Note that for cross-modal translation tasks, both the source and target modalities are counted in this plot. (c) The distribution of publications by downstream task, with segmentation and classification being the most common tasks in medical imaging. This figure illustrates the increasing interest in using deep generative models for data augmentation in medical imaging and highlights the diversity of tasks and modalities that have been addressed in the literature.
  • Figure S2: Illustration of the three deep generative models that are commonly used for medical image augmentation: (a) generative adversarial networks (GANs), which consist of a generator and a discriminator network trained adversarially to generate realistic data; (b) variational autoencoders (VAEs), which consist of an encoder and a decoder network trained to reconstruct data and learn a compact latent representation; and (c) diffusion models, which consist of a forward process that gradually adds noise to the data over a series of steps and a learned reverse process that removes it step by step to model the data distribution.
  • Figure S3: Examples of true contrast CT scans and synthetic non-contrast CT scans generated using a CycleGAN, adapted from Sandfort et al. [sandfort2019data]. The left columns show the true contrast CT scans, while the right columns present the synthetic non-contrast CT scans. The synthetic non-contrast images generated with CycleGAN appear convincing, even in the presence of significant abnormalities in the contrast CT scans. The last column on the right displays unrelated examples of non-contrast images. The letters A to F mark various abnormalities/pathologies, and the arrows indicate their corresponding synthetic non-contrast CT images. These markers are not essential for understanding the main purpose of the figure, which is to demonstrate the generator's ability to produce realistic images.
  • Figure S4: MRIs synthesized using a denoising diffusion probabilistic model (DDPM) [ho2020denoising] trained on the BraTS2020 dataset. The first row shows a sample of original images, while the second row shows a sample of synthesized images generated using the DDPM.
  • Figure S5: Illustration of the augmentation pipeline for a generative-model-based data augmentation. The input data, $x$, are fed into the generative model, $g$, which synthesizes additional data samples to augment the training set. The downstream architecture, $e$, which may take the form of a convolutional neural network or U-Net, is then trained on a combination of the synthesized data and real data from the training set. The training set is split into training and validation sets, where the validation set contains only real data for evaluation purposes. After training, the model can be evaluated using various test sets.
  • ...and 1 more figures
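
The augmentation pipeline described in the Figure S5 caption can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the "images" are random toy vectors, the generative model $g$ is replaced by a hypothetical per-class Gaussian sampler (`GaussianGenerator`) rather than a VAE, GAN, or diffusion model, and the downstream architecture $e$ is replaced by a hypothetical nearest-centroid classifier (`NearestCentroid`) rather than a CNN or U-Net. All names and sizes below are illustrative. The point it shows is the data flow: $g$ is fitted on the training split only, synthetic samples are mixed into the training set, and the validation set contains real data only.

```python
import numpy as np


def make_real_data(rng, n_per_class=40, dim=16):
    """Toy stand-in for real images: two classes of flattened vectors."""
    x0 = rng.normal(0.0, 1.0, size=(n_per_class, dim))
    x1 = rng.normal(1.5, 1.0, size=(n_per_class, dim))
    x = np.concatenate([x0, x1])
    y = np.concatenate([np.zeros(n_per_class, dtype=int),
                        np.ones(n_per_class, dtype=int)])
    return x, y


def split_train_val(rng, x, y, val_fraction=0.25):
    """Split real data; the validation part never receives synthetic samples."""
    idx = rng.permutation(len(x))
    n_val = int(len(x) * val_fraction)
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return (x[train_idx], y[train_idx]), (x[val_idx], y[val_idx])


class GaussianGenerator:
    """Hypothetical placeholder for g: a per-class diagonal Gaussian."""

    def fit(self, x, y):
        self.stats = {int(c): (x[y == c].mean(axis=0),
                               x[y == c].std(axis=0) + 1e-6)
                      for c in np.unique(y)}
        return self

    def sample(self, rng, n_per_class):
        xs, ys = [], []
        for c, (mu, sigma) in self.stats.items():
            xs.append(rng.normal(mu, sigma, size=(n_per_class, mu.shape[0])))
            ys.append(np.full(n_per_class, c, dtype=int))
        return np.concatenate(xs), np.concatenate(ys)


class NearestCentroid:
    """Hypothetical placeholder for e: assigns the class of the closest centroid."""

    def fit(self, x, y):
        self.classes = np.unique(y)
        self.centroids = np.stack([x[y == c].mean(axis=0) for c in self.classes])
        return self

    def predict(self, x):
        dists = np.linalg.norm(x[:, None, :] - self.centroids[None, :, :], axis=2)
        return self.classes[dists.argmin(axis=1)]


def run_pipeline(seed=0, n_synth_per_class=100):
    rng = np.random.default_rng(seed)
    x, y = make_real_data(rng)
    (x_tr, y_tr), (x_val, y_val) = split_train_val(rng, x, y)

    # Fit g on the real training split only, then synthesize extra samples.
    g = GaussianGenerator().fit(x_tr, y_tr)
    x_syn, y_syn = g.sample(rng, n_synth_per_class)

    # Train e on the combination of real and synthetic training data.
    x_aug = np.concatenate([x_tr, x_syn])
    y_aug = np.concatenate([y_tr, y_syn])
    e = NearestCentroid().fit(x_aug, y_aug)

    # Evaluate on real validation data only.
    val_acc = float((e.predict(x_val) == y_val).mean())
    return {"n_real_train": len(x_tr), "n_synth": len(x_syn),
            "n_val": len(x_val), "val_acc": val_acc}


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping the validation split free of synthetic samples, as the caption specifies, means that the reported score reflects performance on real data rather than on the generator's own output. In practice, the two placeholder classes would be replaced by a trained generative model and the downstream network of interest.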