A Deep-Learning-Based Label-free No-Reference Image Quality Assessment Metric: Application in Sodium MRI Denoising
Shuaiyu Yuan, Tristan Whitmarsh, Dimitri A Kessler, Otso Arponen, Mary A McLean, Gabrielle Baxter, Frank Riemer, Aneurin J Kennerley, William J Brackenbury, Fiona J Gilbert, Joshua D Kaggie
TL;DR
This work tackles the lack of ground-truth references for evaluating sodium MRI denoising by introducing the Model Specialization Metric (MSM), a label-free no-reference IQA method that leverages the degradation in DL model predictions when input data diverge from the training distribution. MSM computes the distance between an input image and the model's prediction for that image; the model, built on a backbone such as U-net or SwinIR, is trained on clean images to map ground truth to itself. Across proton T$_1$w and sodium MRI datasets, MSM demonstrates competitive SRCC/PLCC performance against FR- and NR-IQA baselines and shows substantial agreement with expert judgments of denoised sodium MR images, particularly with the SwinIR backbone (average $\kappa$ ≈ 0.653). The results support MSM as a practical, ground-truth-free tool for quality control (QC) of denoised sodium MRI, with implications for clinical reliability and workflow efficiency, and motivate future hybrid architectures to further improve robustness across distortions.
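The scoring step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the distance is assumed here to be a mean absolute difference, and `box_filter_model` is a hypothetical stand-in (a 3x3 mean filter) for a network such as U-net or SwinIR trained to reproduce clean images. The names `msm_score`, `clean`, and `noisy` are likewise illustrative.

```python
import numpy as np


def msm_score(image, model):
    """Mean absolute difference between an image and the model's prediction.

    Assumption: the source only says "distance"; L1 is used for illustration.
    """
    image = np.asarray(image, dtype=np.float64)
    prediction = np.asarray(model(image), dtype=np.float64)
    return float(np.mean(np.abs(image - prediction)))


def box_filter_model(image):
    """Hypothetical stand-in for a trained network: a 3x3 mean filter.

    It leaves smooth images almost unchanged and alters noisy ones,
    loosely mimicking a model specialized to clean data.
    """
    padded = np.pad(image, 1, mode="edge")
    height, width = image.shape
    out = np.zeros_like(image)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + height, dx:dx + width]
    return out / 9.0


# Synthetic test images (not MRI data): a smooth blob and a noisy copy.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
clean = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 12.0 ** 2))
noisy = clean + rng.normal(0.0, 0.1, size=clean.shape)

score_clean = msm_score(clean, box_filter_model)
score_noisy = msm_score(noisy, box_filter_model)
print(f"clean: {score_clean:.4f}  noisy: {score_noisy:.4f}")
```

In this sketch a smaller score means the stand-in model changes the image less, so the noisy image receives the larger score. No reference image or quality label is needed at scoring time.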
Abstract
New multinuclear MRI techniques, such as sodium MRI, generally suffer from low image quality due to an inherently low signal. Postprocessing methods, such as image denoising, have been developed for image enhancement. However, assessing these enhanced images is challenging, especially when high-resolution, high-signal reference images are unavailable, as in sodium MRI. No-reference Image Quality Assessment (NR-IQA) metrics address this problem. Existing learning-based NR-IQA metrics rely on labels derived from subjective human opinions or from metrics such as Signal-to-Noise Ratio (SNR), which are either time-consuming to obtain or lack accurate ground truths, resulting in unreliable assessment. We note that deep learning (DL) models are specialized to their training set, meaning that deviations of the input test data from the training data reduce prediction accuracy. Therefore, we propose a novel DL-based NR-IQA metric, the Model Specialization Metric (MSM), which does not depend on ground-truth images or labels. MSM evaluates the quality of an input image by measuring the difference between the input image and the model's prediction. Experiments on both simulated distorted proton T1-weighted MR images and denoised sodium MR images demonstrate that MSM exhibits superior evaluation performance across various simulated noises and distortions. MSM also shows substantial agreement with expert evaluations, achieving an average Cohen's Kappa coefficient of 0.6528 and outperforming existing NR-IQA metrics.
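For readers unfamiliar with the agreement statistic, Cohen's Kappa compares the observed agreement between two raters with the agreement expected by chance, $\kappa = (p_o - p_e)/(1 - p_e)$. The sketch below computes it for two label sequences. The data are hypothetical (they are not the study's ratings), and how the reported average was formed across raters or images is not stated in this section, so the averaging step is not reproduced.

```python
import numpy as np


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's Kappa for two equally long sequences of categorical ratings."""
    a = np.asarray(ratings_a)
    b = np.asarray(ratings_b)
    if a.shape != b.shape or a.size == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    categories = np.union1d(a, b)
    # Observed agreement: fraction of items given the same label.
    p_observed = float(np.mean(a == b))
    # Chance agreement: product of the two raters' marginal frequencies.
    p_expected = float(sum(np.mean(a == c) * np.mean(b == c) for c in categories))
    if p_expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical binary judgments (1 = acceptable, 0 = not acceptable).
metric_choice = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
expert_choice = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
kappa = cohens_kappa(metric_choice, expert_choice)
print(f"kappa = {kappa:.3f}")
```

Here the two sequences agree on 8 of 10 items and chance agreement is 0.5, giving $\kappa$ = 0.6. On the commonly used Landis and Koch scale, values between 0.61 and 0.80 are described as substantial agreement, which is the band the reported 0.6528 falls in.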
