Paired Diffusion: Generation of related, synthetic PET-CT-Segmentation scans using Linked Denoising Diffusion Probabilistic Models

Rowan Bradbury, Katherine A. Vallis, Bartlomiej W. Papiez

TL;DR

This research introduces a novel architecture that generates multiple related PET-CT-tumour mask pairs using paired networks and conditional encoders; a modified perceptual loss function is required to ensure accurate feature alignment.

Abstract

The rapid advancement of Artificial Intelligence (AI) in biomedical imaging and radiotherapy is hindered by the limited availability of large imaging data repositories. With recent research and improvements in denoising diffusion probabilistic models (DDPM), high-quality synthetic medical scans are now possible. Despite this, there is currently no way of generating multiple related images, such as a corresponding ground truth which can be used to train models, so synthetic scans are often manually annotated before use. This research introduces a novel architecture that is able to generate multiple, related PET-CT-tumour mask pairs using paired networks and conditional encoders. Our approach includes innovative, timestep-controlled mechanisms and a 'noise-seeding' strategy to improve DDPM sampling consistency. While our model requires a modified perceptual loss function to ensure accurate feature alignment, we show generation of clearly aligned synthetic images and improvement in segmentation accuracy with generated images.

Paper Structure (12 sections, 2 equations, 4 figures)

Figures (4)

  • Figure 1: Illustration of the simplified low-pass filter, parameterised by just two variables ranging from 0 to 1. This makes the filter more intuitive to control, and faster at learning useful representations, than a full filter spanning width, height, and channels. In our model, the filter plays a vital role in suppressing noise in the conditional input (which is itself noised) so that the encoders can focus on semantic features. Because all information entering the conditional encoders must pass through this layer, we found that the non-pretrained FF-Parser introduced training difficulties by degrading the input, whereas our method enforces noise reduction from initialisation with only two learnable weights. Since the Gaussian noise is added across all frequencies, we did not consider the FF-Parser's ability to generate more complex noise-reduction schemes useful here. Crucially, our filter also allows timestep-dependent adjustment, so it can reduce input noise effectively and efficiently regardless of timestep.
  • Figure 2: Examples of misgenerations caused by overly aggressive 'fast' sampling.
  • Figure 3: Examples of image generation with different model training configurations. Columns represent each synthetic modality. Left to right: CT, PET, tumour segmentation.
  • Figure 4: Comparison of fully-trained, single-modality DDPM generated images to assess generation quality.
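
The Figure 1 caption describes a low-pass filter with two parameters in [0, 1] and a timestep-dependent adjustment, but it does not give the filter's functional form. The NumPy sketch below is therefore only an illustration under stated assumptions, not the authors' implementation: the parameter names `cutoff` and `softness`, the radial linear roll-off, and the linear shrinking of the cutoff with timestep are all hypothetical choices made for this example.

```python
import numpy as np


def low_pass_mask(height, width, cutoff, softness):
    """Radial low-pass mask for a centred (fftshift-ed) 2D spectrum.

    `cutoff` and `softness` are hypothetical names for the two
    parameters in [0, 1]; the source does not define them.
    """
    fy = np.fft.fftshift(np.fft.fftfreq(height))  # cycles/pixel, in [-0.5, 0.5)
    fx = np.fft.fftshift(np.fft.fftfreq(width))
    # Normalised radius: 1.0 corresponds to the Nyquist frequency along an axis.
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2) / 0.5
    # Pass everything below `cutoff`, then fall linearly to zero over `softness`.
    ramp = 1.0 - (radius - cutoff) / max(softness, 1e-6)
    return np.clip(ramp, 0.0, 1.0)


def apply_low_pass(image, cutoff, softness, t=0, num_steps=1000):
    """Filter a 2D image; noisier (later) timesteps are filtered harder.

    Assumption: the effective cutoff shrinks linearly with t / num_steps.
    """
    effective_cutoff = cutoff * (1.0 - t / num_steps)
    mask = low_pass_mask(image.shape[0], image.shape[1], effective_cutoff, softness)
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return filtered.real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = rng.normal(size=(64, 64))  # stand-in for a noised conditional input
    early = apply_low_pass(noisy, cutoff=0.3, softness=0.1, t=100)
    late = apply_low_pass(noisy, cutoff=0.3, softness=0.1, t=900)
    print(noisy.var(), early.var(), late.var())
```

In this sketch the zero-frequency component always passes unchanged, so the image mean is preserved while high-frequency content is attenuated. In a trainable model the two parameters would be learnable weights constrained to [0, 1] (for example through a sigmoid), which is again an assumption rather than something the caption states.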