Memory-Efficient 3D Denoising Diffusion Models for Medical Image Processing

Florentin Bieder; Julia Wolleb; Alicia Durrer; Robin Sandkühler; Philippe C. Cattin

TL;DR

The main contribution of this paper is the memory-efficient patch-based diffusion model, PatchDDM, which can be applied to the total volume during inference while the training is performed only on patches.

Abstract

Denoising diffusion models have recently achieved state-of-the-art performance in many image-generation tasks. They do, however, require a large amount of computational resources. This limits their application to medical tasks, where we often deal with large, high-resolution 3D volumes. In this work, we present several ways to reduce the resource consumption of 3D diffusion models and apply them to a dataset of 3D images. The main contribution of this paper is the memory-efficient patch-based diffusion model PatchDDM, which can be applied to the total volume during inference while the training is performed only on patches. While the proposed diffusion model can be applied to any image generation task, we evaluate the method on the tumor segmentation task of the BraTS2020 dataset and demonstrate that we can generate meaningful three-dimensional segmentations.
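
The patch-based training idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `coordinate_encoding` and `random_patch`, the linear-ramp encoding in the range [-1, 1], and the uniform random patch position are all assumptions made for illustration. The sketch only shows how coordinate channels computed on the full volume can be cropped together with a patch, so that each patch carries its position within the volume.

```python
import numpy as np


def coordinate_encoding(shape):
    """Three channels holding linear ramps in [-1, 1], one per spatial axis.

    The ramp range is an assumption of this sketch.
    """
    axes = [np.linspace(-1.0, 1.0, n, dtype=np.float32) for n in shape]
    grid = np.meshgrid(*axes, indexing="ij")
    return np.stack(grid, axis=0)  # shape (3, D, H, W)


def random_patch(volume, patch_shape, rng):
    """Crop a random patch from a (C, D, H, W) volume.

    The coordinate channels are computed on the full volume and cropped
    with the same slices, so the patch keeps its global position.
    Returns the patch with shape (C + 3, d, h, w) and its start indices.
    """
    ce = coordinate_encoding(volume.shape[1:])
    full = np.concatenate([volume, ce], axis=0)
    starts = [
        int(rng.integers(0, s - p + 1))  # upper bound is exclusive
        for s, p in zip(volume.shape[1:], patch_shape)
    ]
    spatial = tuple(slice(st, st + p) for st, p in zip(starts, patch_shape))
    return full[(slice(None),) + spatial], starts


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    volume = np.zeros((4, 8, 8, 8), dtype=np.float32)  # hypothetical input channels
    patch, starts = random_patch(volume, (4, 4, 4), rng)
    print(patch.shape, starts)
```

At inference time the same `coordinate_encoding` could be concatenated to the whole volume without cropping, which matches the statement that the model is applied to the total volume while training only sees patches.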

Paper Structure (23 sections, 5 equations, 8 figures, 3 tables)

Introduction
Contribution
Related Work
Method
Denoising Diffusion Models
Architecture
Patch-based Approach with Coordinate-encoding
Baseline methods
Denoising Diffusion Models with Ensembling for Segmentation
Experiments
Dataset
Training Details
Accelerated Sampling
Results
Segmentation Ensembling
...and 8 more sections

Figures (8)

Figure 1: Overview of our proposed method PatchDDM. The diffusion model is optimized for memory efficiency and speed by training only on coordinate-encoded patches. The input consists of the noised $x_t$, the volumes $b$ that are to be segmented and serve as the condition for the segmentation, and a coordinate encoding $CE$ for the patches. During sampling, the whole 3D volume can be processed at once.
Figure 2: The architecture of the U-Net-like network with averaging skip connections. In the original network, as in the U-Net, the $\bigoplus$ operator is a concatenation $x = (x_s, x_u)$; in our case, it is an averaging operator $x = (x_s + x_u)/2$.
Figure 3: The ground truth segmentation $x_0$ is degraded by the noising process $q$. We train a network to perform the denoising process $p_\vartheta$, that is, given some noised image $x_t$, we train it to denoise it with the MR-sequences $b$ as a condition.
Figure 4: The evaluation metrics on the test set as a function of the ensemble size.
Figure 5: The average Dice score and HD95 metric on the test set as a function of the number of sampling steps and the ensemble size. The white sections indicate that we did not evaluate that combination.
...and 3 more figures
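
The difference between the two skip-connection operators in the Figure 2 caption can be seen in a short NumPy sketch. The function names and the channel-first layout `(C, D, H, W)` are assumptions for illustration, not taken from the paper's code. Concatenation doubles the channel count of the merged feature map, while averaging keeps it unchanged, which is one plausible reason it reduces memory use in the decoder.

```python
import numpy as np


def skip_concat(x_s, x_u):
    """Standard U-Net skip connection: x = (x_s, x_u), stacked along channels."""
    return np.concatenate([x_s, x_u], axis=0)


def skip_average(x_s, x_u):
    """Averaging skip connection: x = (x_s + x_u) / 2, channel count unchanged."""
    return (x_s + x_u) / 2.0


if __name__ == "__main__":
    # Hypothetical feature maps with layout (C, D, H, W).
    x_s = np.ones((16, 8, 8, 8), dtype=np.float32)       # skip path
    x_u = 3.0 * np.ones((16, 8, 8, 8), dtype=np.float32)  # upsampling path
    print(skip_concat(x_s, x_u).shape)   # channels doubled
    print(skip_average(x_s, x_u).shape)  # channels unchanged
```

Both operators require `x_s` and `x_u` to have matching spatial dimensions; averaging additionally requires matching channel counts.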
