K-Space-Aware Cross-Modality Score for Synthesized Neuroimage Quality Assessment
Guoyang Xie, Jinbao Wang, Yawen Huang, Jiayi Lyu, Feng Zheng, Yefeng Zheng, Yaochu Jin
TL;DR
K-CROSS is a lesion- and frequency-aware metric for cross-modality neuroimage synthesis that jointly models tumor regions, k-space information, and shared anatomical structure. It uses a two-stage training regime with a complex U‑Net for k-space features, a tumor/structure pathway with a shared encoder, and two score networks that predict quality scores aligned with radiologist judgments. The method is validated on the NIRPS dataset, which contains 6,000 radiologist evaluations, and agrees more closely with expert assessments than traditional metrics such as PSNR and SSIM and other IQA baselines, especially in capturing MR-specific properties. The work provides an MRI-informed evaluation framework and a large radiologist-annotated dataset, with implications for assessing cross-modality synthesis in clinical contexts and in image generation tasks beyond MRI.
Abstract
How to assess cross-modality medical image synthesis has been largely unexplored. The most widely used measures, such as PSNR and SSIM, focus on structural features but neglect the lesion location and the k-space characteristics that are fundamental to medical images. To address this gap, we propose a new metric, K-CROSS. Specifically, K-CROSS uses a pre-trained multi-modality segmentation network to predict the lesion location, together with a tumor encoder that represents features such as texture details and brightness intensities. To reflect the frequency-specific information that follows from magnetic resonance imaging principles, both k-space features and vision features are extracted and used in our comprehensive encoders, which are trained with a frequency reconstruction penalty. The structure-shared encoders are constrained with a similarity loss to capture the intrinsic structural information common to both modalities. As a consequence, features learned from lesion regions, k-space, and anatomical structures are all captured and serve as our quality evaluators. We evaluate performance by constructing a large-scale cross-modality neuroimaging perceptual similarity (NIRPS) dataset with 6,000 radiologist judgments. Extensive experiments demonstrate that the proposed method outperforms other metrics, particularly in its agreement with radiologists' judgments on NIRPS.
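To make the motivation for k-space features concrete, the following minimal sketch compares two distortions that have the same pixel-wise mean squared error, and therefore the same PSNR, but sit in opposite parts of k-space. It is an illustration only and is not the K-CROSS encoder or its frequency reconstruction penalty; the names `to_kspace` and `band_errors` and the `cutoff` parameter are assumptions introduced here, and k-space is approximated by a centered 2D FFT of a single image slice.

```python
import numpy as np

def to_kspace(image):
    """Centered, orthonormal 2D FFT of a real-valued image slice."""
    return np.fft.fftshift(np.fft.fft2(image, norm="ortho"))

def band_errors(reference, synthesized, cutoff=0.25):
    """Split the k-space squared error into (low, high) frequency energy.

    `cutoff` is a radius in cycles per pixel (hypothetical parameter);
    frequencies at or below it count as the low band.
    """
    diff = to_kspace(reference) - to_kspace(synthesized)
    h, w = reference.shape
    # Frequency coordinates shifted so that zero frequency is at the center.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    energy = np.abs(diff) ** 2
    low = radius <= cutoff
    return float(energy[low].sum()), float(energy[~low].sum())

# Two distortions with identical pixel-wise MSE (0.01 per pixel).
rng = np.random.default_rng(0)
ref = rng.random((8, 8))
yy, xx = np.indices(ref.shape)
offset = ref + 0.1                          # uniform brightness shift
checker = ref + 0.1 * (-1.0) ** (yy + xx)   # pixel-level checkerboard

print(band_errors(ref, offset))   # error energy falls in the low band
print(band_errors(ref, checker))  # error energy falls in the high band
```

Because the transform is orthonormal, the two band energies sum to the total squared pixel error, so the split shows where in frequency an error lies without changing its overall size. A learned k-space representation such as the one described above goes well beyond this fixed two-band split.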
