cWDM: Conditional Wavelet Diffusion Models for Cross-Modality 3D Medical Image Synthesis

Paul Friedrich, Alicia Durrer, Julia Wolleb, Philippe C. Cattin

TL;DR

A conditional Wavelet Diffusion Model (cWDM) for directly solving a paired image-to-image translation task on high-resolution volumes by combining a Wavelet Diffusion Model for high-resolution 3D image synthesis with a simple conditioning strategy.

Abstract

This paper contributes to the "BraTS 2024 Brain MR Image Synthesis Challenge" and presents a conditional Wavelet Diffusion Model (cWDM) for directly solving a paired image-to-image translation task on high-resolution volumes. While deep learning-based brain tumor segmentation models have demonstrated clear clinical utility, they typically require MR scans from various modalities (T1, T1ce, T2, FLAIR) as input. However, due to time constraints or imaging artifacts, some of these modalities may be missing, hindering the application of well-performing segmentation algorithms in clinical routine. To address this issue, we propose a method that synthesizes one missing modality image conditioned on three available images, enabling the application of downstream segmentation models. We treat this paired image-to-image translation task as a conditional generation problem and solve it by combining a Wavelet Diffusion Model for high-resolution 3D image synthesis with a simple conditioning strategy. This approach allows us to directly apply our model to full-resolution volumes, avoiding artifacts caused by slice- or patch-wise data processing. While this work focuses on a specific application, the presented method can be applied to all kinds of paired image-to-image translation problems, such as CT $\leftrightarrow$ MR and MR $\leftrightarrow$ PET translation, or mask-conditioned anatomically guided image generation.
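
The abstract's claim that a wavelet-based model can operate on full-resolution volumes can be illustrated with a single-level 3D discrete wavelet transform, which turns one volume into 8 sub-bands at half the resolution per axis without losing information. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the Haar wavelet, the function names (`haar_split`, `haar_merge`, `dwt3d`, `idwt3d`), and the requirement of even-sized axes are all choices made here for illustration.

```python
import numpy as np

def haar_split(x, axis):
    """One-level orthonormal Haar split along `axis`; returns (low, high)."""
    even = np.take(x, np.arange(0, x.shape[axis], 2), axis=axis)
    odd = np.take(x, np.arange(1, x.shape[axis], 2), axis=axis)
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def haar_merge(low, high, axis):
    """Inverse of haar_split; `axis` must be a non-negative index."""
    even = (low + high) / np.sqrt(2.0)
    odd = (low - high) / np.sqrt(2.0)
    # Interleave even and odd samples back along `axis`.
    stacked = np.stack([even, odd], axis=axis + 1)
    shape = list(low.shape)
    shape[axis] *= 2
    return stacked.reshape(shape)

def dwt3d(vol):
    """Volume (D, H, W) with even sizes -> coefficients (8, D/2, H/2, W/2)."""
    bands = [vol]
    for axis in range(3):
        bands = [b for band in bands for b in haar_split(band, axis)]
    return np.stack(bands)  # index 0 is the low-pass (approximation) band

def idwt3d(coeffs):
    """Coefficients (8, D/2, H/2, W/2) -> volume (D, H, W)."""
    bands = list(coeffs)
    for axis in (2, 1, 0):  # undo the splits in reverse order
        bands = [haar_merge(bands[i], bands[i + 1], axis)
                 for i in range(0, len(bands), 2)]
    return bands[0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vol = rng.standard_normal((8, 6, 4))  # toy stand-in for an MR volume
    coeffs = dwt3d(vol)
    print(coeffs.shape)                        # 8 sub-bands, each axis halved
    print(np.allclose(idwt3d(coeffs), vol))    # the transform is invertible
```

Because the transform is invertible, a generative model can work on the smaller 8-channel coefficient volume and the full-resolution image can be recovered afterwards with the inverse transform.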

Paper Structure

This paper contains 16 sections, 6 equations, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: Schematic overview of the proposed pipeline for missing MR image generation, in this case for a missing FLAIR image. We aim to generate the missing modality image conditioned on the three available images, ultimately allowing pre-trained downstream segmentation models to be applied. The same principle applies if another imaging modality is missing. For simplicity, all 3D volumes are displayed as 2D slices.
  • Figure 2: Schematic overview of the proposed conditional Wavelet Diffusion Model, in this case for a missing T1ce image. The process of generating the wavelet coefficients $\tilde{x}_0$ of the output image is conditioned on the wavelet coefficients of the conditioning images by concatenating them with the noisy coefficients in each denoising step.
  • Figure 3: Qualitative results of our proposed method. The synthetic images are generated conditioned on the real images from the three other modalities. We display the middle slice in the axial (top), sagittal (middle), and coronal (bottom) planes.
  • Figure 4: Additional qualitative results of our proposed method. The synthetic images are generated conditioned on the real images from the three other modalities. We display the middle slice in the axial (top), sagittal (middle), and coronal (bottom) planes.
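
The conditioning strategy described in the Figure 2 caption (concatenating the conditioning coefficients with the noisy coefficients in each denoising step) can be sketched as follows. This is an illustrative outline only: `sample`, `denoise_step`, and the toy shapes are hypothetical names and values introduced here, the 8 channels per image assume a single-level 3D wavelet transform, and the dummy step function merely stands in for a trained network and a real diffusion update rule.

```python
import numpy as np

def sample(cond_coeffs, denoise_step, num_steps, rng):
    """Generate target wavelet coefficients conditioned on other modalities.

    cond_coeffs:  list of arrays, each (8, d, h, w), one per available image.
    denoise_step: callable (net_in, t) -> array (8, d, h, w); hypothetical
                  placeholder for the trained model plus the update rule.
    """
    # Start from pure Gaussian noise in the wavelet-coefficient domain.
    x = rng.standard_normal(cond_coeffs[0].shape)
    for t in reversed(range(num_steps)):
        # Channel-wise concatenation: noisy target first, then conditions.
        net_in = np.concatenate([x, *cond_coeffs], axis=0)
        x = denoise_step(net_in, t)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy coefficients for three available modalities (e.g. T1, T2, FLAIR).
    cond = [rng.standard_normal((8, 4, 4, 4)) for _ in range(3)]

    def dummy_step(net_in, t):
        # NOT a real denoiser: it only shrinks the target channels so that
        # the loop runs end to end.
        return 0.9 * net_in[:8]

    x0 = sample(cond, dummy_step, num_steps=5, rng=rng)
    print(x0.shape)  # same shape as one set of wavelet coefficients
```

With three conditioning images the network input has 8 + 3 × 8 = 32 channels under these assumptions, while the output keeps the 8 channels of the target image, which would then be mapped back to image space with an inverse wavelet transform.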