Image-Difficulty-Aware Evaluation of Super-Resolution Models
Atakan Topaloglu, Ahmet Bilican, Cansu Korkmaz, A. Murat Tekalp
TL;DR
The paper tackles the inadequacy of average SR evaluation metrics, which can conceal how models perform on images of varying difficulty. It introduces two image-difficulty measures, $HFI$ and $RIEI$, to predict challenging images and proposes a difficulty-aware evaluation framework that uses quadrant-based PSNR analysis in the $HFI$-$RIEI$ plane alongside a localized artifact metric, $PSNR99$. Through case studies across different SR approaches, the authors demonstrate that these measures reveal performance patterns hidden by average PSNR and help identify where models excel or struggle on specific content types. The work has practical implications for guiding model design (e.g., mixture of experts) and for refinement of loss functions to mitigate artifacts on hard images, improving SR benchmarking and development.
Abstract
Image super-resolution (SR) models are commonly evaluated by scores averaged over benchmark test sets. Such averages fail to reflect how these models perform on images of varying difficulty, and they do not reveal that some models generate artifacts on certain difficult images. We propose difficulty-aware performance evaluation procedures to better differentiate between single-image SR (SISR) models that produce visually different results on some images but yield close average performance scores over the entire test set. In particular, we propose two image-difficulty measures, the high-frequency index (HFI) and the rotation-invariant edge index (RIEI), to predict the test images on which one model would yield significantly better visual results than another, together with an evaluation method in which these visual differences are reflected in objective measures. Experimental results demonstrate the effectiveness of the proposed image-difficulty measures and evaluation methodology.
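To illustrate how an average score can hide difficulty-dependent behavior, the following is a minimal sketch of quadrant-based PSNR analysis in the HFI-RIEI plane. It is not the authors' implementation. The per-image HFI, RIEI, and PSNR values are made-up toy numbers, the function name `quadrant_psnr` is hypothetical, and splitting the plane at the median of each score is an assumption, since this summary does not state how the quadrants are defined.

```python
from statistics import mean, median


def quadrant_psnr(hfi, riei, psnr, hfi_thr=None, riei_thr=None):
    """Average per-image PSNR within each quadrant of the (HFI, RIEI) plane.

    Thresholds default to the median of each score (an assumption, not
    taken from the paper). Keys are (hfi_level, riei_level) tuples.
    """
    if not (len(hfi) == len(riei) == len(psnr)):
        raise ValueError("hfi, riei, and psnr must have the same length")
    hfi_thr = median(hfi) if hfi_thr is None else hfi_thr
    riei_thr = median(riei) if riei_thr is None else riei_thr

    buckets = {}
    for h, r, p in zip(hfi, riei, psnr):
        key = ("high" if h > hfi_thr else "low",
               "high" if r > riei_thr else "low")
        buckets.setdefault(key, []).append(p)
    # Only quadrants that contain at least one image are reported.
    return {key: mean(values) for key, values in buckets.items()}


# Toy data: four images, one per quadrant (values are illustrative only).
hfi = [0.1, 0.2, 0.8, 0.9]
riei = [0.1, 0.9, 0.2, 0.8]
psnr_model_a = [32.0, 30.0, 28.0, 26.0]
psnr_model_b = [33.0, 31.0, 27.0, 25.0]

# Both models have the same average PSNR over the whole set ...
avg_a, avg_b = mean(psnr_model_a), mean(psnr_model_b)

# ... but the per-quadrant breakdown separates them on "hard" images.
quad_a = quadrant_psnr(hfi, riei, psnr_model_a)
quad_b = quadrant_psnr(hfi, riei, psnr_model_b)
```

In this toy setup both models average 29.0 dB, yet model A is 1 dB better in the high-HFI quadrants and model B is 1 dB better in the low-HFI ones, which is the kind of pattern an average over the full test set cannot show.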
