FDDM: Unsupervised Medical Image Translation with a Frequency-Decoupled Diffusion Model

Yunxiang Li; Hua-Chieh Shao; Xiaoxue Qian; You Zhang

TL;DR

This paper introduces the Frequency-Decoupled Diffusion Model (FDDM) for magnetic resonance (MR)-to-computed tomography (CT) conversion. FDDM generates high-quality target-domain images while preserving the accuracy of the translated anatomical structures, which supports more accurate downstream tasks, including anatomy segmentation and radiotherapy planning.

Abstract

Diffusion models have demonstrated significant potential in producing high-quality images in medical image translation to aid disease diagnosis, localization, and treatment. Nevertheless, current diffusion models have limited success in achieving faithful image translations that accurately preserve the anatomical structures of medical images, especially for unpaired datasets. Preserving structural and anatomical details is essential to reliable medical diagnosis and treatment planning, as structural mismatches can lead to disease misidentification and treatment errors. In this study, we introduce the Frequency-Decoupled Diffusion Model (FDDM) for MR-to-CT conversion. FDDM first obtains the anatomical information of the CT image from the MR image through an initial conversion module. This anatomical information then guides a subsequent diffusion model to generate high-quality CT images. Our diffusion model uses a dual-path reverse diffusion process for low-frequency and high-frequency information, achieving a better balance between image quality and anatomical accuracy. We extensively evaluated FDDM on public datasets for brain MR-to-CT and pelvis MR-to-CT translation, where it outperformed other GAN-based, VAE-based, and diffusion-based models. The evaluation metrics were Fréchet Inception Distance (FID), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity Index Measure (SSIM). FDDM achieved the best scores on all metrics for both datasets, particularly excelling in FID, with scores of 25.9 for brain data and 29.2 for pelvis data, significantly outperforming other methods. These results demonstrate that FDDM can generate high-quality target-domain images while maintaining the accuracy of translated anatomical structures.
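Of the three metrics named in the abstract, PSNR is the simplest to state: it is 10 * log10(R^2 / MSE), where R is the intensity range and MSE is the mean squared error between the reference and the synthetic image. The sketch below is a minimal NumPy illustration of that standard definition, not the authors' evaluation code; the function name `psnr`, the `data_range` parameter, and the toy arrays are assumptions made for this example.

```python
import numpy as np

def psnr(reference, estimate, data_range):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(data_range^2 / MSE)."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images have unbounded PSNR
    return float(10.0 * np.log10(data_range ** 2 / mse))

# Toy example (hypothetical data): images scaled to [0, 1] with a constant error of 0.1.
real_ct = np.full((4, 4), 0.5)
synthetic_ct = real_ct + 0.1
print(psnr(real_ct, synthetic_ct, data_range=1.0))  # approximately 20 dB
```

Higher PSNR and SSIM indicate closer agreement with the reference CT, whereas lower FID indicates that the distribution of synthetic images is closer to that of real images.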

Paper Structure (16 sections, 31 equations, 8 figures, 5 tables)

Introduction
Related Work
Unsupervised Medical Image Translation
Diffusion Models for Image Translation
Method
Initial Conversion Module
Forward Diffusion Process
Dual-Path Reverse Diffusion Process
Experiments
Experimental and Dataset Setup
Comparison with Other Methods
Brain Dataset
Pelvis Dataset
Ablation Study
Impact of the Forward Diffusion Step $T_s$
...and 1 more sections

Figures (8)

Figure 1: Overview of the FDDM framework for MR-to-CT conversion. We first extract the boundary $S(X)$ of the MR image $X$, and then input the MR image $X$ along with its boundary $S(X)$ into the initial conversion module to obtain a coarse CT $\overline{Y}$ and CT boundary $S(\overline{Y})$. Subsequently, the overall pattern information of CT, $y_{T_s}$, is obtained by subjecting the coarse CT image $\overline{Y}$ to a forward diffusion process (low-pass filter). The CT boundary $S(\overline{Y})$ and noisy CT $y_{T_s}$ (overall pattern information) serve as anatomical information to jointly guide the reverse diffusion process to generate the final CT $y_0$.
Figure 2: This figure shows the detailed framework of our dual-path reverse diffusion, where $h_t$ and $l_t$ are the noisy CT images at step $t$ for the explicit probabilistic model and the implicit probabilistic model, respectively. The clean CT images $h_0^{(t)}$ and $l_0^{(t)}$ are predicted by the models and fused using the Laplacian pyramid, resulting in $y_0^{(t)}$. Subsequently, $h_{t-1}$ and $l_{t-1}$ are obtained through different paths.
Figure 3: Diagram of the initial conversion module and the rotation consistency loss.
Figure 4: Visual comparison between other models and FDDM on the brain MR-to-CT translation dataset.
Figure 5: Visual comparison between other models and FDDM on the pelvis MR-to-CT translation dataset.
...and 3 more figures
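
The captions of Figures 1 and 2 describe two operations that a short sketch may make more concrete: diffusing the coarse CT forward to step $T_s$, and fusing the two predicted clean images by frequency band. The code below is a rough illustration under stated assumptions, not the authors' implementation. It assumes the standard DDPM forward process with a linear noise schedule, a hypothetical $T_s = 300$, a 3x3 box blur as a stand-in low-pass filter, a single-level decomposition instead of a full Laplacian pyramid, and that $h_0^{(t)}$ supplies the high-frequency band while $l_0^{(t)}$ supplies the low-frequency band. All function names are invented for this example.

```python
import numpy as np

def forward_diffuse(y0, t, alpha_bar, rng):
    """Sample y_t from N(sqrt(a_t) * y0, (1 - a_t) * I), the standard DDPM forward step."""
    noise = rng.standard_normal(y0.shape)
    return np.sqrt(alpha_bar[t]) * y0 + np.sqrt(1.0 - alpha_bar[t]) * noise

def box_blur(img, k=3):
    """Crude low-pass filter: k x k mean with edge padding (stand-in for a Gaussian kernel)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def fuse_frequency_bands(h0, l0, k=3):
    """Single-level Laplacian-style fusion: low band from l0, high band (detail) from h0."""
    low = box_blur(l0, k)
    high = h0 - box_blur(h0, k)
    return low + high

# Hypothetical linear noise schedule and stopping step T_s (not taken from the paper).
betas = np.linspace(1e-4, 0.02, 1000)
alpha_bar = np.cumprod(1.0 - betas)
T_s = 300

rng = np.random.default_rng(0)
coarse_ct = rng.random((8, 8))            # stands in for the coarse CT from the initial module
y_Ts = forward_diffuse(coarse_ct, T_s, alpha_bar, rng)

h0_pred = rng.random((8, 8))              # stands in for the explicit-path prediction
l0_pred = rng.random((8, 8))              # stands in for the implicit-path prediction
y0_fused = fuse_frequency_bands(h0_pred, l0_pred)
print(y_Ts.shape, y0_fused.shape)
```

In this simplified form, fusing an image with itself returns the image unchanged, because the low and high bands sum back to the original. A multi-level pyramid would apply the same split recursively on downsampled copies.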
