A Flow-based Truncated Denoising Diffusion Model for Super-resolution Magnetic Resonance Spectroscopic Imaging

Siyuan Dong, Zhuotong Cai, Gilbert Hangel, Wolfgang Bogner, Georg Widhalm, Yaqing Huang, Qinghao Liang, Chenyu You, Chathura Kumaragamage, Robert K. Fulbright, Amit Mahajan, Amin Karbasi, John A. Onofrey, Robin A. de Graaf, James S. Duncan

TL;DR

A Flow-based Truncated Denoising Diffusion Model (FTDDM) is introduced for super-resolution MRSI. It shortens the diffusion process by truncating the diffusion chain and estimates the truncated steps with a normalizing flow-based network.

Abstract

Magnetic Resonance Spectroscopic Imaging (MRSI) is a non-invasive imaging technique for studying metabolism and has become a crucial tool for understanding neurological diseases, cancers and diabetes. High spatial resolution MRSI is needed to characterize lesions, but in practice MRSI is acquired at low resolution due to time and sensitivity restrictions caused by the low metabolite concentrations. There is therefore a pressing need for a post-processing approach that generates high-resolution MRSI from low-resolution data, which can be acquired quickly and with high sensitivity. Deep learning-based super-resolution methods have provided promising results for improving the spatial resolution of MRSI, but they still have limited capability to generate accurate and high-quality images. Recently, diffusion models have demonstrated stronger learning capability than other generative models in various tasks, but sampling from diffusion models requires iterating through a large number of diffusion steps, which is time-consuming. This work introduces a Flow-based Truncated Denoising Diffusion Model (FTDDM) for super-resolution MRSI, which shortens the diffusion process by truncating the diffusion chain and estimates the truncated steps with a normalizing flow-based network. The network is conditioned on upscaling factors to enable multi-scale super-resolution. To train and evaluate the deep learning models, we developed a 1H-MRSI dataset acquired from 25 high-grade glioma patients. We demonstrate that FTDDM outperforms existing generative models while speeding up the sampling process by over 9-fold compared to the baseline diffusion model. Neuroradiologists' evaluations confirmed the clinical advantages of our method, which also supports uncertainty estimation and sharpness adjustment, extending its potential clinical applications.
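
The truncated sampling chain described in the abstract can be written compactly. The following is a sketch in standard DDPM notation, which is an assumption here rather than the paper's exact formulation: $T$ denotes the length of the full (untruncated) chain, and $\mathbf{z}$, $\mathbf{y}$, $F^{-1}_{\phi}$, $\theta$ and $T_{trunc}$ follow the symbols used in the figure captions.

```latex
\begin{align}
\mathbf{z} &\sim \mathcal{N}(\mathbf{0}, \mathbf{I}), \\
\mathbf{x}_{T_{trunc}} &= F^{-1}_{\phi}(\mathbf{z};\, \mathbf{y}), \\
\mathbf{x}_{t-1} &\sim p_{\theta}(\mathbf{x}_{t-1} \mid \mathbf{x}_{t}, \mathbf{y}),
  \qquad t = T_{trunc}, \dots, 1.
\end{align}
```

Under this reading, sampling costs one pass through the flow network plus $T_{trunc}$ denoising steps, instead of $T$ denoising steps, which is where a speed-up would come from when $T_{trunc} \ll T$.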

Paper Structure

This paper contains 22 sections, 12 equations, 11 figures, 4 tables, 1 algorithm.

Figures (11)

  • Figure 1: A comparison between (a) the conventional diffusion model DDPM and (b) our method FTDDM. $\mathbf{x}_0$ is the noiseless high-resolution MRSI metabolic map. The forward diffusion process gradually adds Gaussian noise to $\mathbf{x}_0$. The noise is only added to the region of interest, as defined by a quality-filtering mask, to avoid the necessity of suppressing noise from the background during the reverse diffusion process. The reverse diffusion process uses a denoising network with parameters $\theta$ to retrace the forward diffusion process, provided with any condition images $\mathbf{y}$. $F^{-1}_{\phi}$ is the inverse of a normalizing flow-based network used to bridge the gap between the pure Gaussian noise $\mathbf{z}$ and the noisy image at the truncation point $\mathbf{x}_{T_{trunc}}$.
  • Figure 2: Overview of the proposed method. The truncated denoising diffusion employs a Denoising UNet to iteratively estimate and remove noise from $\mathbf{x}_{T_{trunc}}$, resulting in a noiseless high-resolution MRSI metabolic map $\mathbf{x}_0$. The Denoising UNet also takes the condition $\mathbf{y}$, which is a concatenation of the low-resolution (LR) metabolic map, a quality-filtering mask $M$, T1 MRI and FLAIR MRI, i.e. $\mathbf{y}=\{$LR, $M$, T1, FLAIR$\}$. The Denoising UNet consists of Residual Network (ResNet) blocks and Conditional Instance Normalization (CIN). The CIN embeds timestep $t$ and upscaling factor $s$ into the network. The blocks in the middle have multi-head attention (Attn) modules, following [nichol2021improved]. $\mathbf{x}_{T_{trunc}}$ is generated from the Gaussian noise $\mathbf{z}$ via the flow-based noisy image generation network, which comprises a series of flow layers across multiple dimensions, in line with [dong2022flow]. Each flow layer contains conditional affine coupling, affine injector, invertible 1 × 1 convolution and activation normalization [lugmayr2020srflow]. The condition images $\mathbf{y}$ are infused into the flow layers through Condition Networks (Condition Net), which consist of convolution layers and LeakyReLU.
  • Figure 3: Qualitative comparisons of FTDDM against other methods at upscaling factor $s$=4.0. The two examples are: a tCr image from patient p1 and a Gly image from patient p2. FLAIR MRI provides the corresponding anatomical reference, with the tumor delineated by the red dashed line. Each metabolic map is shown alongside its error map, except for the ground truth. Note that the images below the ground truth, framed in red, are the standard deviation maps of 50 FTDDM samples and can be used for uncertainty estimation (they are not error maps of the ground truth).
  • Figure 4: Qualitative comparisons of FTDDM against other methods at upscaling factor $s$=8.0. The two examples are: a tCh image from patient p3 and a Glu image from patient p4.
  • Figure 5: Model performance of DDPM, DDPM with respacing, DPM-Solver++, TDPM and FTDDM at different numbers of sampling steps ($T_{respace}$ for DDPM respace, $T_{solver}$ for DPM-Solver++, $T_{trunc}$ for TDPM and FTDDM).
  • ...and 6 more figures
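
The masked forward diffusion described in the Figure 1 caption can be illustrated with a minimal NumPy sketch. It assumes the standard DDPM closed-form marginal $\mathbf{x}_t = \sqrt{\bar{\alpha}_t}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon}$ and a linear beta schedule. The function names, the schedule values and the choice to leave pixels outside the mask exactly equal to $\mathbf{x}_0$ are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for a linear beta schedule.

    The schedule endpoints are common DDPM defaults, assumed here.
    """
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def masked_forward_diffusion(x0, mask, t, alpha_bar, rng):
    """Draw x_t from x_0 in closed form, adding noise only where mask is True.

    x0        : 2D array, noiseless high-resolution metabolic map
    mask      : 2D boolean array, region of interest (quality-filtering mask)
    t         : diffusion step, 1-indexed (1 <= t <= len(alpha_bar))
    alpha_bar : 1D array from linear_alpha_bar
    rng       : numpy.random.Generator
    """
    if not 1 <= t <= len(alpha_bar):
        raise ValueError("t must lie in [1, len(alpha_bar)]")
    a = alpha_bar[t - 1]
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps
    # Assumption: background pixels are passed through untouched.
    return np.where(mask, x_t, x0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha_bar = linear_alpha_bar(1000)   # hypothetical full chain length
    x0 = np.ones((8, 8))
    mask = np.zeros((8, 8), dtype=bool)
    mask[2:6, 2:6] = True
    t_trunc = 100                        # hypothetical truncation point
    x_t = masked_forward_diffusion(x0, mask, t_trunc, alpha_bar, rng)
    print("background unchanged:", np.array_equal(x_t[~mask], x0[~mask]))
```

Because the background never receives noise in this sketch, a reverse process would only need to denoise inside the mask, which matches the motivation given in the caption.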