Subjective and Objective Quality Evaluation of Super-Resolution Enhanced Broadcast Images on a Novel SR-IQA Dataset
Yongrok Kim, Junha Shin, Juhyun Lee, Hyunsuk Ko
TL;DR
The paper addresses the challenge of evaluating SR images generated from low-quality broadcast content when SR can both distort and enhance the image. It introduces the SREB dataset, which applies multiple SR methods directly to native-resolution broadcast frames (no downsampling) at 2x and 4x, and collects subjective MOS via pairwise comparisons from 51 participants. A Bradley-Terry maximum-likelihood framework converts pairwise votes into MOS, and the study analyzes both subjective results and the performance of 9 NR-IQA and 2 RR-IQA metrics, highlighting ARNIQA’s strong correlation with MOS and the limitations of existing metrics. The findings advance understanding of how SR improvements interact with distortions in broadcast content and provide a benchmark to develop scaling-aware IQA metrics with practical impact for consumer media technologies.
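To make the Bradley-Terry step concrete, the following is a minimal sketch of how pairwise votes can be turned into per-image scores by maximum likelihood. It is not the authors' implementation: the function names, the iterative minorization-maximization update, the toy vote counts, and the final rescaling of log-strengths to a 0-100 range are all assumptions made for illustration. The sketch also assumes every image wins at least one comparison, so that all strengths stay positive.

```python
import math


def bradley_terry_strengths(wins, n_iter=1000, tol=1e-12):
    """Estimate Bradley-Terry strengths from a win-count matrix.

    wins[i][j] is the number of times item i was preferred over item j.
    Uses the classic iterative (MM) update for the maximum-likelihood
    solution; strengths are normalized to sum to 1.
    """
    n = len(wins)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i])
            # Sum over opponents of (comparisons with j) / (p_i + p_j).
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            new_p.append(total_wins / denom)
        norm = sum(new_p)
        new_p = [v / norm for v in new_p]
        delta = max(abs(a - b) for a, b in zip(new_p, p))
        p = new_p
        if delta < tol:
            break
    return p


def rescale_log_strengths(p, low=0.0, high=100.0):
    """Map log-strengths linearly onto [low, high] (illustrative choice only)."""
    logs = [math.log(v) for v in p]
    lo, hi = min(logs), max(logs)
    return [low + (high - low) * (x - lo) / (hi - lo) for x in logs]


# Hypothetical vote counts for three images (not data from the paper).
wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
strengths = bradley_terry_strengths(wins)
scores = rescale_log_strengths(strengths)
print(strengths, scores)
```

In this toy example every pair is compared the same number of times, so the estimated strengths follow the same order as the raw win counts. With unbalanced comparisons the two orderings can differ, which is one reason to fit a model rather than count wins.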
Abstract
Displaying low-quality broadcast content full-screen on high-resolution displays requires Super-Resolution (SR), a key consumer technology. Recent SR methods not only increase resolution while preserving the original image information but also enhance perceived quality. However, evaluating the quality of SR images generated from low-quality sources, such as SR-enhanced broadcast content, is challenging because both distortions and improvements must be considered. Assessing SR image quality without an original high-quality source is a further challenge. Little research has specifically addressed Image Quality Assessment (IQA) of SR images under these conditions. In this work, we introduce a new IQA dataset for SR broadcast images in both 2K and 4K resolutions. We conducted a subjective quality evaluation to obtain the Mean Opinion Score (MOS) for these SR images and performed a comprehensive human study to identify the key factors influencing perceived quality. Finally, we evaluated the performance of existing IQA metrics on our dataset. This study reveals the limitations of current metrics, highlighting the need for a more robust IQA metric that better correlates with the perceived quality of SR images.
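As an illustration of how an IQA metric can be checked against MOS, the sketch below computes Pearson's linear correlation (PLCC) and Spearman's rank-order correlation (SRCC), two coefficients commonly reported in IQA benchmarking. That these are the coefficients used in this study is an assumption, as are the function names and the sample values, which are made up and not taken from the dataset. The nonlinear regression step that some benchmarks apply before computing PLCC is omitted.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with order[i].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical MOS values and metric predictions (illustrative only).
mos = [20.0, 35.0, 50.0, 65.0, 90.0]
metric = [0.30, 0.28, 0.55, 0.60, 0.95]

plcc = pearson(mos, metric)
srcc = spearman(mos, metric)
print(f"PLCC={plcc:.3f}  SRCC={srcc:.3f}")
```

SRCC depends only on how the metric orders the images, while PLCC also reflects how linearly its values track MOS, so reporting both gives a fuller picture of agreement with human judgments.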
