Avoiding Quality Saturation in UGC Compression Using Denoised References
Xin Xiong, Samuel Fernández-Menduiña, Eduardo Pavez, Antonio Ortega, Neil Birkbeck, Balu Adsumilli
TL;DR
This work tackles quality saturation (QS) in UGC compression by measuring distortion against a denoised reference (D-MSE) and introducing two pre-encoding detectors, DSD and RDSD, that predict the saturation point before encoding. DSD uses an input-dependent threshold derived from the D-MSE between the input UGC and its denoised counterpart, while RDSD estimates the Lagrange multiplier at the saturation point and transfers it to the target codec to obtain a saturation QP. In experiments on YouTube-UGC data with AVC, the authors report that avoiding the saturation region yields BD-rate savings of roughly 8–20% across multiple non-reference metrics (NRMs), without altering the UGC input fed to the encoder. The approach works with standard-compliant codecs, can be computed at the block level in the transform domain, and runs before encoding, offering a practical way to reduce the bitrate wasted on noise and artifacts in UGC pipelines.
Abstract
Video-sharing platforms must re-encode large volumes of noisy user-generated content (UGC) to meet streaming demands. However, conventional codecs, which aim to minimize the mean squared error (MSE) between the compressed and input videos, can cause quality saturation (QS) when applied to UGC, i.e., increasing the bitrate preserves input artifacts without improving visual quality. A direct approach to this problem is to detect QS by repeatedly evaluating a non-reference metric (NRM) on videos compressed with multiple codec parameters, which is inefficient. In this paper, we reframe UGC compression and QS detection through the lens of noisy source coding theory: rather than using an NRM, we compute the MSE with respect to the denoised UGC, which serves as an alternative reference (D-MSE). Unlike the MSE measured between the UGC input and the compressed UGC, D-MSE saturates at non-zero values as the bitrate increases, a phenomenon we term distortion saturation (DS). Since D-MSE can be computed at the block level in the transform domain, we can efficiently detect DS without coding and decoding with various parameters. We propose two methods for DS detection: distortion saturation detection (DSD), which relies on an input-dependent threshold derived from the D-MSE of the input UGC, and rate-distortion saturation detection (RDSD), which estimates the Lagrangian at the saturation point using a low-complexity compression method. Both methods work as a pre-processing step that helps standard-compliant codecs avoid QS in UGC compression. Experiments with AVC show that preventing encoding in the saturation region, i.e., avoiding encoding at QPs that result in QS according to our methods, achieves BD-rate savings of 8% to 20% across multiple NRMs, compared to a naïve baseline that encodes at the given input QP while ignoring QS.
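To make the distortion saturation effect concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a 1-D synthetic signal in place of video, a moving average as a stand-in denoiser, and uniform scalar quantization as a stand-in codec; the names `moving_average`, `quantize`, and the 5% tolerance used in the threshold are all hypothetical choices made for illustration. It compares the conventional MSE (against the noisy input) with the D-MSE (against the denoised input) as the quantization step shrinks.

```python
import math
import random


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def quantize(signal, step):
    """Uniform scalar quantization (toy stand-in for a codec)."""
    return [step * round(x / step) for x in signal]


def moving_average(signal, radius):
    """Box filter with truncated borders (toy stand-in for a denoiser)."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


random.seed(0)
n = 2048
clean = [50.0 * math.sin(2 * math.pi * i / 256) for i in range(n)]
noisy = [c + random.gauss(0.0, 4.0) for c in clean]  # plays the role of the UGC input
denoised = moving_average(noisy, radius=4)           # alternative reference

# D-MSE of the uncompressed input: the level at which D-MSE saturates.
floor = mse(denoised, noisy)

steps = [32.0, 16.0, 8.0, 4.0, 2.0, 1.0, 0.5, 0.25]  # coarse to fine
results = []
for step in steps:
    coded = quantize(noisy, step)
    results.append((step, mse(noisy, coded), mse(denoised, coded)))

# Hypothetical input-dependent rule: treat D-MSE within 5% of the floor
# as saturated and keep the coarsest step that already satisfies it.
threshold = 1.05 * floor
candidates = [step for step, _, d_mse in results if d_mse <= threshold]
saturation_step = max(candidates) if candidates else None

for step, plain_mse, d_mse in results:
    print(f"step={step:6.2f}  MSE={plain_mse:9.4f}  D-MSE={d_mse:9.4f}")
print(f"D-MSE floor={floor:.4f}  saturation step={saturation_step}")
```

In this toy setting the MSE against the noisy input keeps shrinking toward zero as the step gets finer, while the D-MSE levels off near the D-MSE of the uncompressed input. Steps finer than the detected one spend additional bits mostly on reproducing the noise, which is the intuition behind skipping the saturation region.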
