
Cyclic 2.5D Perceptual Loss for Cross-Modal 3D Medical Image Synthesis: T1w MRI to Tau PET

Junho Moon, Symac Kim, Haejun Chung, Ikbeom Jang

TL;DR

This work synthesizes tau PET images from T1-weighted MRI to support Alzheimer's disease (AD) assessment. It introduces a cyclic 2.5D perceptual loss that alternates plane-wise perceptual learning across axial, coronal, and sagittal views, with progressively shorter cycles. The loss is combined with SSIM and MSE losses and paired with by-manufacturer SUVR standardization to mitigate inter-scanner variability. Evaluations across multiple architectures (including 3D U-Net, UNETR, SwinUNETR, CycleGAN, and Pix2Pix) show higher SSIM and ROI fidelity than 2.5D and 3D perceptual losses and existing PET-from-MRI approaches, indicating improved preservation of tau pathology. The approach offers a practical, noninvasive surrogate for tau burden that could aid pre-screening and triage in clinical workflows and broaden access to tau-related assessments.

Abstract

There is a demand for medical image synthesis or translation to generate synthetic images of missing modalities from available data. This need stems from challenges such as restricted access to high-cost imaging devices, government regulations, or failure to follow up with patients or study participants. In medical imaging, preserving high-level semantic features is often more critical than achieving pixel-level accuracy. Perceptual loss functions are widely employed to train medical image synthesis or translation models, as they quantify differences in high-level image features using a pre-trained feature extraction network. While 3D and 2.5D perceptual losses are used in 3D medical image synthesis, they face challenges, such as the lack of pre-trained 3D models or difficulties in balancing loss reduction across different planes. In this work, we focus on synthesizing 3D tau PET images from 3D T1-weighted MR images. We propose a cyclic 2.5D perceptual loss that sequentially computes the 2D average perceptual loss for each of the axial, coronal, and sagittal planes over epochs, with the cycle duration gradually decreasing. Additionally, we process tau PET images using by-manufacturer standardization to enhance the preservation of high-SUVR regions indicative of tau pathology and mitigate SUVR variability caused by inter-manufacturer differences. We combine the proposed loss with SSIM and MSE losses and demonstrate its effectiveness in improving both quantitative and qualitative performance across various generative models, including U-Net, UNETR, SwinUNETR, CycleGAN, and Pix2Pix.
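The plane cycling described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the initial cycle duration, the linear shrink rule, the `volume[z][y][x]` axis order, and the identity feature extractor (a stand-in for a pre-trained 2D network) are all hypothetical choices made for the example.

```python
PLANES = ("axial", "coronal", "sagittal")


def plane_schedule(total_epochs, initial_duration=9, shrink=3, min_duration=1):
    """Return the plane used at each epoch.

    Each plane is held for `duration` epochs. After one full pass over the
    three planes, the duration shrinks by `shrink` (an assumed rule) until
    it reaches `min_duration`.
    """
    schedule = []
    duration = initial_duration
    while len(schedule) < total_epochs:
        for plane in PLANES:
            schedule.extend([plane] * duration)
        duration = max(min_duration, duration - shrink)
    return schedule[:total_epochs]


def slices(volume, plane):
    """Yield 2D slices of a nested-list volume indexed as volume[z][y][x].

    The mapping of planes to axes is an assumption for this sketch.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    if plane == "axial":
        for z in range(depth):
            yield [row[:] for row in volume[z]]
    elif plane == "coronal":
        for y in range(height):
            yield [volume[z][y][:] for z in range(depth)]
    elif plane == "sagittal":
        for x in range(width):
            yield [[volume[z][y][x] for y in range(height)] for z in range(depth)]
    else:
        raise ValueError(f"unknown plane: {plane}")


def identity_features(slice_2d):
    """Placeholder feature extractor: returns the flattened slice unchanged."""
    return [value for row in slice_2d for value in row]


def plane_perceptual_loss(pred, target, plane, features=identity_features):
    """Average, over all slices in `plane`, of the MSE between paired features."""
    losses = []
    for pred_slice, target_slice in zip(slices(pred, plane), slices(target, plane)):
        fp, ft = features(pred_slice), features(target_slice)
        losses.append(sum((a - b) ** 2 for a, b in zip(fp, ft)) / len(fp))
    return sum(losses) / len(losses)
```

In a training loop, one would look up `plane_schedule(total_epochs)[epoch]` at the start of each epoch and pass that plane to `plane_perceptual_loss`, replacing `identity_features` with the real feature network and adding the SSIM and MSE terms with weights of one's choosing.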

Paper Structure

This paper contains 13 sections, 11 equations, 6 figures, 6 tables, and 2 algorithms.

Figures (6)

  • Figure 1: Overview of our proposed cyclic 2.5D perceptual loss. Following the generation of a 3D tau PET image from the given 3D T1w MR image, both the generated tau PET image $\hat{y}$ and the ground truth tau PET image $y$ are sliced into axial, coronal, or sagittal planes based on the current epoch. The sliced image pairs are then processed through the first 23 layers of a pre-trained VGG-16 model, and the corresponding feature maps are extracted. For each slice pair, the mean squared error between the paired feature maps is computed and averaged.
  • Figure 2: Overall data preprocessing procedures for T1w MR and tau PET images, including the by-manufacturer standardization of tau PET SUVR. The tau PET images were preprocessed through noise reduction via smoothing, spatial coregistration to baseline FDG PET, motion correction through frame averaging, alignment to the AC-PC plane, transformation into Talairach space, intensity normalization using the cerebellar gray matter, and adaptive smoothing. The T1w MRI preprocessing included nonparametric non-uniformity intensity normalization, Freesurfer intensity normalization, and conversion to a cubic format. Extraneous tissues and skulls were removed from the T1w MR and tau PET images. The tau PET images were subsequently co-registered with their corresponding T1w MR images. The intensities of the T1w MR images were scaled to a range of -1 to 1, while the tau PET SUVR values were standardized to a mean of 0 and a standard deviation of 1, adjusted according to the scanner manufacturer.
  • Figure 3: Cortical tau PET SUVR by manufacturer. The tau PET SUVR distribution across different manufacturers is presented for cortical regions, comparing participants with AD to those in the LMCI/MCI/EMCI group. The visualization depicts SUVR values at the 99th, 90th, and 75th percentiles. Across all manufacturers, SUVR values at these percentiles are consistently higher in AD participants than in the LMCI/MCI/EMCI group. Notably, the 99th percentile SUVR values vary across manufacturers in both groups. For all manufacturers, most SUVR values are concentrated in a lower range, around 1. Data from the MiE manufacturer for AD participants were unavailable in the collected dataset; thus, the corresponding SUVR distribution is not displayed. The cortical regions were defined using the Desikan-Killiany atlas.
  • Figure 4: Qualitative comparison of images generated by the 3D U-Net with different loss functions. The first column presents the input T1w MR images, the second shows the ground truth tau PET images, and the subsequent columns show the generated tau PET images. Our method (the third column) captures tau burden more accurately than the other methods, clearly illustrating hyperphosphorylated tau deposition. The SSIM and PSNR values between the ground truth tau PET image and the generated tau PET image are reported from the third column to the final column. Note: The SSIM and PSNR values represent performance across the 3D volume.
  • Figure 5: Visualization of multiple pairs comprising an input MR image, the corresponding ground truth tau PET image, and the generated tau PET image. The top-left and bottom-right pairs show agreement between the quantitative metrics and the detection of visual hot spots, defined as regions with high SUVR values in the ground truth tau PET images. However, the top-right and bottom-left pairs are cases where the metrics did not adequately reflect visual hot spot detection. Note: The SSIM and PSNR values are calculated for individual 2D axial slices.
  • ...and 1 more figure
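The final intensity steps in the Figure 2 caption (scaling T1w MR intensities to the range -1 to 1, and standardizing tau PET SUVR to zero mean and unit standard deviation per scanner manufacturer) can be illustrated as follows. This is a hedged sketch, not the paper's pipeline: the function names and the flat-list data layout are hypothetical, and the choice to pool all voxel values per manufacturer and use the population standard deviation is an assumption, since the caption does not say which voxels or subjects the statistics are computed over.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def manufacturer_stats(scans):
    """Compute (mean, std) of SUVR values pooled per manufacturer.

    `scans` is an iterable of (manufacturer, list_of_suvr_values) pairs.
    Pooling across all scans of a manufacturer is an assumption.
    """
    pooled = defaultdict(list)
    for manufacturer, values in scans:
        pooled[manufacturer].extend(values)
    return {m: (fmean(v), pstdev(v)) for m, v in pooled.items()}


def standardize(values, manufacturer, stats):
    """Standardize SUVR values with the statistics of their manufacturer."""
    mean, std = stats[manufacturer]
    return [(v - mean) / std for v in values]


def scale_to_unit_range(values):
    """Min-max scale intensities (e.g. T1w MR) to the range [-1, 1]."""
    lo, hi = min(values), max(values)
    return [2 * (v - lo) / (hi - lo) - 1 for v in values]
```

For example, with two placeholder manufacturers `"A"` and `"B"`, `manufacturer_stats([("A", [1.0, 3.0]), ("B", [10.0, 30.0])])` yields separate statistics, so a scan from either manufacturer maps onto the same standardized scale. The sketch does not guard against constant inputs (zero standard deviation or zero intensity range), which real data handling would need to address.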