BVI-UGC: A Video Quality Database for User-Generated Content Transcoding

Zihao Qi, Chen Feng, Fan Zhang, Xiaozhong Xu, Shan Liu, David Bull

TL;DR

This work addresses perceptual quality assessment for user-generated content (UGC) that undergoes transcoding by introducing the BVI-UGC database. The database simulates a realistic UGC delivery pipeline: 60 source clips across 15 categories yield 60 non-pristine reference videos, which are transcoded into 1,080 test sequences. A crowdsourced study with more than 3,500 participants provides the mean opinion scores (MOS). Benchmarking 10 full-reference (FR) and 11 no-reference (NR) video quality assessment (VQA) metrics shows that every metric stays below an SROCC of 0.6 against the subjective scores, which points to a clear gap in current quality assessment tools for transcoding scenarios. The dataset, open-source tools, and benchmarking framework offer a resource for developing more accurate FR and NR VQA methods for UGC transcoding and streaming.

Abstract

In recent years, user-generated content (UGC) has become one of the major video types consumed via streaming networks. Numerous research contributions have focused on assessing its visual quality through subjective tests and objective modeling. In most cases, objective assessments are based on a no-reference scenario, where the corresponding reference content is assumed not to be available. However, full-reference video quality assessment is also important for UGC in the delivery pipeline, particularly associated with the video transcoding process. In this context, we present a new UGC video quality database, BVI-UGC, for user-generated content transcoding, which contains 60 (non-pristine) reference videos and 1,080 test sequences. In this work, we simulated the creation of non-pristine reference sequences (with a wide range of compression distortions), typical of content uploaded to UGC platforms for transcoding. A comprehensive crowdsourced subjective study was then conducted involving more than 3,500 human participants. Based on this collected subjective data, we benchmarked the performance of 10 full-reference and 11 no-reference quality metrics. Our results demonstrate the poor performance (SROCC values are lower than 0.6) of these metrics in predicting the perceptual quality of UGC in two different scenarios (with or without a reference).
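
The SROCC figure quoted above is the Spearman rank-order correlation coefficient between a metric's predictions and the subjective scores. As a rough illustration only (not the authors' benchmarking code), the sketch below computes it in plain Python by ranking both lists, averaging the ranks of tied values, and taking the Pearson correlation of the ranks. The function names and the sample scores are hypothetical and do not come from the BVI-UGC data.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def srocc(predicted: Sequence[float], subjective: Sequence[float]) -> float:
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    if len(predicted) != len(subjective):
        raise ValueError("both sequences must have the same length")
    if len(predicted) < 2:
        raise ValueError("at least two samples are required")
    return pearson(average_ranks(predicted), average_ranks(subjective))


if __name__ == "__main__":
    # Hypothetical metric outputs and subjective scores for five sequences.
    metric_scores = [0.91, 0.85, 0.40, 0.62, 0.30]
    subjective_scores = [72.0, 80.0, 35.0, 50.0, 41.0]
    print(f"SROCC = {srocc(metric_scores, subjective_scores):.3f}")
```

Because SROCC depends only on rank order, it measures how monotonically a metric tracks subjective quality rather than how closely the raw values match, so metrics on different scales can be compared directly.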

Paper Structure (17 sections, 11 figures, 3 tables, 1 algorithm)

Figures (11)

  • Figure 1: Illustration of the UGC video delivery pipeline. Source videos captured by users may contain various distortions due to poor quality equipment, unskilled cinematography and lossy compression. Captured videos are uploaded to UGC platforms where they are transcoded and streamed to video consumers. During the transcoding process, a perceptually accurate VQA metric is a key component in rate-distortion optimization.
  • Figure 2: Example frames from selected videos in the 15 UGC categories defined in this database. Every category contains both landscape and portrait videos (the ratio is approximately 2:1).
  • Figure 3: Feature distribution of the source content in the BVI-UGC database. (Left) SI versus TI; (right) SI versus CF.
  • Figure 4: Illustration of the content generation process for the BVI-UGC database, which contains 60 non-pristine reference sequences and 1,080 transcoded sequences.
  • Figure 5: Sample blocks from the high quality source, non-pristine reference and transcoded videos (for the same content).
  • ...and 6 more figures