Synthetic Augmentation for Anatomical Landmark Localization using DDPMs

Arnela Hadzic, Lea Bogensperger, Simon Johannes Joham, Martin Urschler

TL;DR

This study explores the use of denoising diffusion probabilistic models (DDPMs) to generate medical images together with their corresponding landmark heatmaps, in order to enhance the training of a supervised deep learning model for anatomical landmark localization (ALL).

Abstract

Deep learning techniques for anatomical landmark localization (ALL) have shown great success, but their reliance on large annotated datasets remains a problem because medical data acquisition and annotation are tedious and costly. While traditional data augmentation, variational autoencoders (VAEs), and generative adversarial networks (GANs) have already been used to synthetically expand medical datasets, diffusion-based generative models have recently started to gain attention for their ability to generate high-quality synthetic images. In this study, we explore the use of denoising diffusion probabilistic models (DDPMs) for generating medical images and their corresponding landmark heatmaps to enhance the training of a supervised deep learning model for ALL. Our novel approach involves a DDPM with a 2-channel input that combines the original medical image with the heatmap of its annotated landmarks. We also propose a novel way to assess the quality of the generated images, using a Markov Random Field (MRF) model for landmark matching and a Statistical Shape Model (SSM) to check landmark plausibility. Finally, we evaluate the DDPM-augmented dataset on an ALL task involving hand X-rays.
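The 2-channel input described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the Gaussian rendering of landmarks with width `sigma`, merging all landmarks into one channel with a pixel-wise maximum, and the linear noise schedule `betas`. The noising step uses the standard closed-form DDPM forward process, not necessarily the authors' exact configuration.

```python
import numpy as np

def landmark_heatmap(shape, landmarks, sigma=3.0):
    """Render a single heatmap with a Gaussian blob at each (row, col) landmark.

    Assumption: blobs are merged with a pixel-wise maximum, so each landmark
    location has the value 1.0.
    """
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    heatmap = np.zeros(shape, dtype=np.float64)
    for r, c in landmarks:
        blob = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        heatmap = np.maximum(heatmap, blob)
    return heatmap.astype(np.float32)

def make_two_channel_sample(image, landmarks, sigma=3.0):
    """Stack the image (channel 0) and its landmark heatmap (channel 1)."""
    heatmap = landmark_heatmap(image.shape, landmarks, sigma)
    return np.stack([image.astype(np.float32), heatmap], axis=0)

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t from q(x_t | x_0) using the closed-form DDPM forward process.

    Both channels receive noise jointly, so image and heatmap stay paired.
    """
    alpha_bar = float(np.cumprod(1.0 - betas)[t])
    noise = rng.standard_normal(x0.shape).astype(np.float32)
    xt = (alpha_bar ** 0.5) * x0 + ((1.0 - alpha_bar) ** 0.5) * noise
    return xt, noise

# Hypothetical usage with a random stand-in for a hand X-ray.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
x0 = make_two_channel_sample(image, landmarks=[(10, 20), (40, 50)])
betas = np.linspace(1e-4, 0.02, 1000)  # assumed linear schedule
xT, _ = forward_diffuse(x0, t=999, betas=betas, rng=rng)
```

Because the two channels are noised and denoised as one sample, a model trained this way produces an image and a matching heatmap together, which is what allows the generated pairs to be used as additional annotated training data.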

Paper Structure

This paper contains 14 sections, 5 equations, 7 figures, and 2 tables.

Figures (7)

  • Figure 1: Illustration of the forward and reverse diffusion processes (best viewed in the PDF version). The forward diffusion process $q(x_t|x_{t-1})$ adds noise to a 2-channel input sample $x_0$ over $T$ timesteps, while the reverse diffusion process $p_\theta(x_{t-1}|x_t)$ denoises the noisy sample $x_T$ to recover the original input sample $x_0$.
  • Figure 2: Qualitative results of DDPM samples with MRF labeling (colored points). Samples a) and b) were automatically selected in the FullDataset experiment and c) was selected in the ReducedDataset experiment, while images such as d) were automatically rejected by our proposed assessment strategy due to abnormalities such as six fingers or a very long thumb.
  • Figure 3: Example of an occluded fingertip, where synthetic DDPM augmentation enables recovery from localization errors. Predicted landmarks are shown as colored points, with red lines indicating their distance to the ground-truth landmarks.
  • Figure 4: MRF topology of a random training sample.
  • Figure 5: Image/heatmap pairs generated by the DDPM in the FullDataset scenario (green frame: automatically accepted samples; red frame: automatically rejected samples).
  • ...and 2 more figures
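
For reference, the two processes named in the Figure 1 caption are usually written as follows in the standard DDPM formulation. The noise schedule $\beta_t$, the learned mean $\mu_\theta$, and the covariance $\Sigma_\theta$ are notation assumed here from that standard formulation; they are not copied from the paper's own equations, which are not shown on this page.

$$
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right),
\qquad
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\!\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)
$$

With $\alpha_t = 1-\beta_t$ and $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$, the forward process also has the closed form $q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t) I\right)$, so $x_t$ can be sampled directly from the 2-channel sample $x_0$ without stepping through every intermediate timestep.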