Distillation-Driven Diffusion Model for Multi-Scale MRI Super-Resolution: Make 1.5T MRI Great Again

Zhe Wang, Yuhua Ru, Fabian Bauer, Aladine Chetouani, Fang Chen, Liping Zhang, Didier Hans, Rachid Jennane, Mohamed Jarraya, Yung Hsin Chen

TL;DR

This work tackles the practical gap between affordable 1.5T MRI and high-resolution 7T MRI by introducing a CLDM-based SR framework that uses gradient nonlinearity and bias field corrections as guidance. A novel progressive distillation strategy trains a lightweight Student to approximate the Teacher’s 7T-like outputs across multi-scale resolutions, dramatically reducing compute and memory needs while preserving accuracy. Experimental results on the HCP dataset show superior perceptual and structural fidelity over SOTA methods, with strong qualitative and quantitative gains, and clinical evaluations at MGH demonstrate potential diagnostic value in seizure and MS cases. The approach offers a scalable, deployable path to improve diagnostic capabilities in settings where high-field MRI is unavailable, with provisions for future multi-modal extension and broader clinical adoption.

Abstract

Magnetic Resonance Imaging (MRI) offers critical insights into microstructural details; however, the spatial resolution of standard 1.5T imaging systems is often limited. In contrast, 7T MRI provides significantly enhanced spatial resolution, enabling finer visualization of anatomical structures. Despite this, the high cost and limited availability of 7T MRI hinder its widespread use in clinical settings. To address this challenge, a novel Super-Resolution (SR) model is proposed to generate 7T-like MRI from standard 1.5T MRI scans. Our approach leverages a diffusion-based architecture, incorporating gradient nonlinearity correction and bias field correction data from 7T imaging as guidance. Moreover, to improve deployability, a progressive distillation strategy is introduced. Specifically, the student model refines the 7T SR task in steps, leveraging feature maps from the inference phase of the teacher model as guidance, so that it progressively approaches 7T SR performance with a smaller, deployable model size. Experimental results demonstrate that our baseline teacher model achieves state-of-the-art SR performance. The student model, while lightweight, sacrifices minimal performance. Furthermore, the student model accepts MRI inputs at varying resolutions without retraining, further enhancing deployment flexibility. The clinical relevance of our proposed method is validated using clinical data from Massachusetts General Hospital. Our code is available at https://github.com/ZWang78/SR.

Paper Structure

This paper contains 25 sections, 9 equations, 4 figures, 7 tables, 2 algorithms.

Figures (4)

  • Figure 1: An example of axial view brain MR scans at 1.5T, 3T, and 7T field strengths.
  • Figure 2: The architecture of the baseline (teacher) model begins by extracting a slice at position $d$ along axis $a$ from a real MRI sequence $x$, denoted as $x_{a,d}$. This slice is encoded by the encoder $\mathcal{E}_1$, producing an initial latent representation $z_0$ that is progressively noised to a noised latent representation $z_T$. The corresponding slice $y_{a,d}$ from a 1.5T MRI sequence $y$, positioned at the same axis and depth, is used as a conditional input. This slice is encoded by a second encoder $\mathcal{E}_2$, and the output is concatenated with $z_T$ to form the initial input for the downsampling network. Throughout the denoising process, the bias field correction $b$ and the gradient nonlinearity correction $g$ are embedded at specific stages of the downsampling and upsampling phases, respectively, to provide guidance, culminating in the denoised latent $\hat{z}_0$. The decoder $\mathcal{D}$ then reconstructs the corresponding MRI slice $\hat{x}_{a,d}$ from $\hat{z}_0$. During inference, the process begins with $z_T$ and $y$, and is repeated for each slice at all positions $(\sum a, \sum d)$. The final 7T-like MRI sequence $\hat{x}$ is constructed by stacking all generated MRI slices in sequential order along each axis, followed by volume averaging.
  • Figure 3: The illustration showcases the progressive knowledge distillation process. The Teacher is highlighted, operating as a high-capacity baseline model capable of generating high-quality 7T-like MRI outputs from 1.5T inputs. The Student is depicted as a lightweight architecture designed to learn from the teacher model’s outputs. The arrows connecting the teacher and student models emphasize the flow of knowledge and guidance during training. The process leverages progressive distillation, where the student model incrementally refines its latent representations by matching intermediate targets provided by the teacher model.
  • Figure 4: Box plots of the performance of the evaluated approaches in terms of PSNR and SSIM.
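
The Figure 2 caption describes inference as slice-wise: every slice along each axis is processed, the slices are stacked back in order, and the resulting volumes are averaged. The following is a minimal NumPy sketch of that stacking-and-averaging step only, not the authors' implementation. The names `enhance_slice` and `reconstruct_volume` are hypothetical, and `enhance_slice` is an identity placeholder standing in for the diffusion model, which is assumed here to return a slice of the same shape as its input.

```python
import numpy as np


def enhance_slice(slice_2d):
    """Hypothetical stand-in for the per-slice SR model (identity here)."""
    return slice_2d


def reconstruct_volume(volume, slice_fn=enhance_slice):
    """Process each slice along every axis, restack, then average the volumes.

    Assumes slice_fn preserves the 2D shape of each slice.
    """
    per_axis_volumes = []
    for axis in range(volume.ndim):
        # Extract the slice at each depth d along this axis and process it.
        processed = [
            slice_fn(np.take(volume, d, axis=axis))
            for d in range(volume.shape[axis])
        ]
        # Stack the processed slices back in sequential order along the same axis.
        per_axis_volumes.append(np.stack(processed, axis=axis))
    # Volume averaging: voxel-wise mean over the per-axis reconstructions.
    return np.mean(per_axis_volumes, axis=0)


if __name__ == "__main__":
    vol = np.random.rand(4, 5, 6)
    out = reconstruct_volume(vol)
    print(out.shape)
```

Averaging the per-axis reconstructions is one simple way to reduce the slice-to-slice inconsistencies that a purely 2D model can introduce along the axis it never sees; the weighting used in the paper may differ.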