Evaluating and Preserving High-level Fidelity in Super-Resolution
Josep M. Rocafort, Shaolin Su, Alexandra Gomez-Villa, Javier Vazquez-Corral
TL;DR
The paper argues that evaluating SR requires assessing high-level fidelity (the preservation of semantics and key content) in addition to traditional perceptual-quality metrics. It constructs the first annotated dataset of fidelity changes across five SR models (including diffusion-based methods) and analyzes how well current models preserve semantics. The authors show that foundation-model embeddings capture high-level fidelity better than traditional IQA metrics and introduce a cosine-similarity fidelity metric, which they then use to fine-tune an SR model. The fine-tuned model improves in both semantic fidelity and perceptual quality, highlighting the practical value of including fidelity feedback in SR evaluation and optimization.
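To make the cosine-similarity fidelity metric mentioned above concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the two embeddings have already been extracted by some foundation model (which model, and which pair of images is compared, is not specified in this summary; a reference image and the SR output are assumed here). The function names `cosine_similarity` and `fidelity_score` are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def fidelity_score(ref_embedding: Sequence[float],
                   sr_embedding: Sequence[float]) -> float:
    """Hypothetical fidelity score: similarity of two image embeddings.

    Assumption: ref_embedding comes from a reference image and
    sr_embedding from the super-resolved output, both produced by the
    same (unspecified) foundation model.
    """
    return cosine_similarity(ref_embedding, sr_embedding)
```

Under these assumptions, a score close to 1 means the two embeddings point in nearly the same direction, which is read as preserved high-level content, while lower scores suggest a semantic change. Using such a score as a training signal would additionally require a differentiable implementation in a deep-learning framework, which this sketch does not provide.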
Abstract
Recent image Super-Resolution (SR) models achieve impressive results in reconstructing details and delivering visually pleasing outputs. However, their strong generative capacity can sometimes hallucinate details and thereby change the image content, even while attaining high visual quality. This type of high-level change is easily identified by humans, yet it is not well captured by existing low-level image quality metrics. In this paper, we establish the importance of measuring high-level fidelity for SR models as a complementary criterion that reveals the reliability of generative SR models. We construct the first annotated dataset with fidelity scores from different SR models, and evaluate how well state-of-the-art (SOTA) SR models actually preserve high-level fidelity. Based on the dataset, we then analyze how existing image quality metrics correlate with fidelity measurement, and further show that this high-level task can be better addressed by foundation models. Finally, by fine-tuning SR models based on our fidelity feedback, we show that both semantic fidelity and perceptual quality can be improved, demonstrating the potential value of our proposed criterion in both model evaluation and optimization. We will release the dataset, code, and models upon acceptance.
