YouTube SFV+HDR Quality Dataset

Yilin Wang; Joong Gon Yim; Neil Birkbeck; Balu Adsumilli

TL;DR

This work addresses the need for large-scale SFV+HDR quality assessment by introducing the YouTube SFV+HDR dataset, comprising 4030 SFV contents (2030 SDR and 2000 HDR) across 10 categories with subjective MOS. It proposes a three-step sampling framework (sampling pool construction, feature space sampling, final content review) to maximize dataset representativeness and diversity. The authors analyze subjective MOS for SDR, HDR2SDR, and HDR, revealing content-category effects and a general MOS advantage for HDR-native content, while evaluating state-of-the-art UGC quality metrics (DOVER, FAST-VQA, FasterVQA) and finding strongest performance for FAST-VQA but notable gaps for HDR2SDR and Gameplay. The dataset and findings provide a valuable resource for advancing SFV+HDR quality assessment and guiding improvements to objective VQA models, with the dataset publicly available to support ongoing research.

Abstract

Short-form videos (SFV) have grown dramatically in popularity over the past few years and have become a major video category with billions of viewers. Meanwhile, High Dynamic Range (HDR), as an advanced feature, has also become increasingly popular on video sharing platforms. As high-impact topics, SFV and HDR raise new questions for video quality research: 1) Is SFV+HDR quality assessment significantly different from traditional User Generated Content (UGC) quality assessment? 2) Do objective quality metrics designed for traditional UGC still work well for SFV+HDR? To answer these questions, we created the first large-scale SFV+HDR dataset with reliable subjective quality scores, covering 10 popular content categories. We also introduce a general sampling framework to maximize the representativeness of the dataset. We provide a comprehensive analysis of subjective quality scores for short-form SDR and HDR videos, and discuss the reliability of state-of-the-art UGC quality metrics and potential improvements.
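The sampling framework is only named here, not specified, so the following is a minimal sketch of what a feature space sampling step could look like, not the authors' procedure. It assumes each candidate video carries a small feature tuple (for example the SI, TI, and UVQ values shown in Figure 1), bins the feature space into a grid, and draws round-robin from occupied cells so that sparse regions are still represented. The function name, the `bins` parameter, and the data layout are all hypothetical.

```python
import random
from collections import defaultdict


def feature_space_sample(pool, n_samples, bins=4, seed=0):
    """Pick up to n_samples ids spread across a binned feature grid.

    pool: list of (video_id, features), where features is a tuple of floats
    such as (SI, TI, UVQ). This layout is an assumption for illustration.
    """
    if not pool:
        return []
    rng = random.Random(seed)
    dims = len(pool[0][1])
    lows = [min(f[d] for _, f in pool) for d in range(dims)]
    highs = [max(f[d] for _, f in pool) for d in range(dims)]

    def cell(features):
        # Map each feature value to a bin index in [0, bins - 1].
        idx = []
        for d, value in enumerate(features):
            span = highs[d] - lows[d]
            # A constant feature puts every item in bin 0.
            pos = 0 if span == 0 else int((value - lows[d]) / span * bins)
            idx.append(min(pos, bins - 1))
        return tuple(idx)

    cells = defaultdict(list)
    for video_id, features in pool:
        cells[cell(features)].append(video_id)
    for members in cells.values():
        rng.shuffle(members)

    # Round-robin over occupied cells so dense regions do not dominate.
    selected = []
    order = sorted(cells)
    while len(selected) < n_samples and any(cells[c] for c in order):
        for c in order:
            if cells[c] and len(selected) < n_samples:
                selected.append(cells[c].pop())
    return selected
```

Compared with uniform random sampling from the pool, a grid-based draw like this trades some fidelity to the pool's natural distribution for broader coverage of rare feature combinations. Which trade-off the dataset actually makes is described in the paper's sampling sections, not here.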

Paper Structure (11 sections, 9 figures, 4 tables)

Introduction
Three Step Video Sampling Framework
Sampling Pool Construction
Feature Space Sampling
Final Content Review
Subjective Data Analysis
Subjective Experiment
SDR MOS Analysis
HDR MOS Analysis
Objective Metric Performance
Conclusion

Figures (9)

Figure 1: Distributions of SI, TI, and UVQ for the entire pool (black) and three content categories Cooking (red), Health (yellow), and Gameplay (blue), whose distributions are significantly different from one another
Figure 2: Random sampling vs. manual sampling.
Figure 3: Samples in high (left) and low (right) quality per category.
Figure 4: MOS distributions of all SDR videos, native SDR, and HDR2SDR respectively.
Figure 5: HDR2SDR samples with high MOS, even though there are some noticeable artifacts.
...and 4 more figures
