Avoiding Quality Saturation in UGC Compression Using Denoised References

Xin Xiong, Samuel Fernández-Menduiña, Eduardo Pavez, Antonio Ortega, Neil Birkbeck, Balu Adsumilli

TL;DR

This work tackles quality saturation (QS) in UGC compression by measuring distortion against a denoised reference (D-MSE) and introducing two pre-encoding detectors, DSD and RDSD, that predict the saturation point before encoding. DSD uses an input-dependent threshold derived from the D-MSE between the input UGC and its denoised counterpart, while RDSD estimates the Lagrange multiplier at saturation and transfers it to the target codec to obtain a saturation QP. In experiments on YouTube-UGC data with AVC, the authors report QS avoidance and BD-rate savings of roughly 8–20% across multiple non-reference metrics (NRMs), without altering the UGC input fed to the encoder. Because the approach runs before encoding, operates in the transform domain, and works with standard-compliant codecs, it offers a practical way to reduce bitrate wasted on noise and artifacts in UGC pipelines.
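
To make the "transfer a Lagrange multiplier to a saturation QP" step concrete, here is a minimal sketch. It assumes the widely used H.264/AVC mode-decision heuristic λ = 0.85 · 2^((QP − 12) / 3) from the reference encoder; the mapping the authors actually use is not given in this summary and may differ. The function names `lambda_from_qp` and `qp_from_lambda` are hypothetical.

```python
import math

def lambda_from_qp(qp: float) -> float:
    """Common AVC heuristic (assumed here, not taken from the paper)."""
    return 0.85 * 2.0 ** ((qp - 12.0) / 3.0)

def qp_from_lambda(lam: float, qp_min: int = 0, qp_max: int = 51) -> int:
    """Invert the heuristic and clamp to the valid AVC QP range."""
    qp = 12.0 + 3.0 * math.log2(lam / 0.85)
    return int(min(max(round(qp), qp_min), qp_max))

# Hypothetical usage: a saturation multiplier estimated elsewhere
# is converted into a saturation QP for the target codec.
lam_star = lambda_from_qp(30)        # stand-in for an estimated multiplier
qp_star = qp_from_lambda(lam_star)
```

Under this heuristic, a larger saturation multiplier maps to a larger saturation QP, meaning the encoder can stop at a coarser quantization level.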

Abstract

Video-sharing platforms must re-encode large volumes of noisy user-generated content (UGC) to meet streaming demands. However, conventional codecs, which aim to minimize the mean squared error (MSE) between the compressed and input videos, can cause quality saturation (QS) when applied to UGC, i.e., increasing the bitrate preserves input artifacts without improving visual quality. A direct approach to this problem is to detect QS by repeatedly evaluating a non-reference metric (NRM) on videos compressed with multiple codec parameters, which is inefficient. In this paper, we re-frame UGC compression and QS detection through the lens of noisy source coding theory: rather than using an NRM, we compute the MSE with respect to the denoised UGC, which serves as an alternative reference (D-MSE). Unlike MSE measured between the UGC input and the compressed UGC, D-MSE saturates at non-zero values as bitrates increase, a phenomenon we term distortion saturation (DS). Since D-MSE can be computed at the block level in the transform domain, we can efficiently detect DS without coding and decoding with various parameters. We propose two methods for DS detection: distortion saturation detection (DSD), which relies on an input-dependent threshold derived from the D-MSE of the input UGC, and rate-distortion saturation detection (RDSD), which estimates the Lagrangian at the saturation point using a low-complexity compression method. Both methods work as a pre-processing step that can help standard-compliant codecs avoid QS in UGC compression. Experiments with AVC show that preventing encoding in the saturation region, i.e., avoiding encoding at QPs that result in QS according to our methods, achieves BD-rate savings of 8%-20% across multiple NRMs, compared to a naïve baseline that encodes at the given input QP while ignoring QS.
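
The difference between MSE and D-MSE can be illustrated with a small numerical sketch. Everything here is an assumption made for illustration: the signal is synthetic, a known clean signal stands in for the output of a denoiser, and plain uniform quantization stands in for a codec. With an orthonormal transform such as the DCT, the same MSE values would be obtained in the transform domain.

```python
import numpy as np

def quantize(x, step):
    """Uniform scalar quantization with the given step size."""
    return step * np.round(x / step)

def mse(a, b):
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 4096)

# Assumption: the clean signal plays the role of the denoised reference.
denoised = 50.0 * np.sin(2 * np.pi * 3 * t)
# Assumption: the "UGC" input is the reference plus Gaussian noise.
ugc = denoised + rng.normal(0.0, 4.0, size=t.shape)

# Smaller steps correspond to higher bitrates.
steps = [32.0, 8.0, 2.0, 0.5, 0.125]
results = []
for step in steps:
    rec = quantize(ugc, step)
    results.append((step, mse(ugc, rec), mse(denoised, rec)))

# Distortion floor: the D-MSE of the uncompressed input itself.
floor = mse(denoised, ugc)

for step, m, d in results:
    print(f"step={step:7.3f}  MSE={m:9.4f}  D-MSE={d:9.4f}")
print(f"D-MSE floor (input vs. denoised) = {floor:.4f}")
```

As the step shrinks, the MSE against the input tends to zero while the D-MSE levels off near the floor, which is the non-zero saturation behavior the abstract describes.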

Paper Structure

This paper contains 24 sections, 2 theorems, 17 equations, 13 figures, 1 table.

Key Result

Proposition 1

If $\theta$ satisfies Eq. (equ_CostFunc), then under high-rate regime assumptions [gish1968aymptotically], we have

Figures (13)

  • Figure 1: (a) A frame from a UGC video; (b–d) rate-distortion (RD) curves from compressing the UGC video with AVC. VIDE [tu2021ugc] and UVQ [wang2021rich] are NRMs that predict visual quality. While MSE (b) converges to zero distortion (i.e., perfect approximation to the input) as bitrate increases, both VIDE (c) and UVQ (d) do not reach perfect quality at high bitrates. (e) A frame from the denoised UGC video; (f–h) RD curves from compressing this denoised UGC video with AVC. Similar to (c–d), VIDE (g) and UVQ (h) saturate at high bitrates when encoding the denoised UGC.
  • Figure 2: Different encoding setups for UGC compression.
  • Figure 3: Examples of UGC frames and their denoised versions. From left to right: original UGC frame, and denoised UGC frames using Practical Blind Denoiser (PBD) [zhang2023practical] and FFmpeg's De-Blocking Filter (DBF) [tomar2006converting]. Both denoisers over-smooth the grass (top) and remove the net (bottom), demonstrating that current denoisers can alter the image content. Moreover, a suboptimal denoiser (e.g., DBF) can produce lower-quality results than the original UGC, as measured by non-reference metrics.
  • Figure 4: The proposed UGC compression system. Given a UGC video, our saturation detection method produces a saturation quality parameter $\text{QP}^*$, indicating where DS occurs. Ideally, if $\text{QP}^*$ exactly matches where QS happens, encoding the video with any $\text{QP}$ lower than $\text{QP}^*$ does not improve quality but wastes bitrate. Therefore, if a user specifies a desired $\text{QP}$ for encoding, we send the larger value between the user-input $\text{QP}$ and $\text{QP}^*$ to the codec.
  • Figure 5: The proposed UGC compression pipeline. For UGC input, we first uniformly sample frames and divide the sampled frames into blocks. Next, an orthogonal transform is applied to each block. In this work, we consider only the DCT. We then detect DS for each block individually, in terms of a quality parameter. Next, we derive a single saturation $\text{QP}^*$ across all blocks. This saturation $\text{QP}^*$ is compared with the user-input $\text{QP}$, and the larger one between the two is sent to the codec.
  • ...and 8 more figures
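
The system and pipeline described in the captions of Figures 4 and 5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the per-block rule (largest QP whose D-MSE stays within a factor `1 + eps` of the input's own D-MSE), the median aggregation across blocks, and the approximation Qstep ≈ 2^((QP − 4) / 6) are all choices made here for the sketch, and every function name is hypothetical.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def to_blocks(frame, n=8):
    """Split a 2-D frame into non-overlapping n x n blocks (edges cropped)."""
    h, w = frame.shape
    h, w = h - h % n, w - w % n
    return (frame[:h, :w].reshape(h // n, n, w // n, n)
            .swapaxes(1, 2).reshape(-1, n, n))

def qstep(qp):
    """Approximate AVC step size: doubles every 6 QP (assumed mapping)."""
    return 2.0 ** ((qp - 4) / 6.0)

def block_saturation_qp(x, d, eps=0.1, qps=range(0, 52)):
    """Largest QP whose D-MSE is within (1 + eps) of the input's D-MSE.

    x, d: DCT coefficients of the input block and the denoised block.
    The thresholding rule is an assumption made for this sketch.
    """
    floor = np.mean((x - d) ** 2)
    best = min(qps)
    for qp in qps:
        rec = qstep(qp) * np.round(x / qstep(qp))
        if np.mean((rec - d) ** 2) <= (1 + eps) * floor:
            best = max(best, qp)
    return best

def saturation_qp(frames, denoised, stride=2, n=8):
    """Sample frames uniformly, detect per block, aggregate by median."""
    c = dct_matrix(n)
    per_block = []
    for f, g in zip(frames[::stride], denoised[::stride]):
        xb = c @ to_blocks(f, n) @ c.T   # 2-D DCT of every input block
        db = c @ to_blocks(g, n) @ c.T   # 2-D DCT of every denoised block
        per_block += [block_saturation_qp(x, d) for x, d in zip(xb, db)]
    return int(np.median(per_block))

def choose_qp(user_qp, qp_star):
    """Send the larger of the user QP and the saturation QP to the codec."""
    return max(user_qp, qp_star)

# Synthetic stand-ins for a UGC clip and its denoised version.
rng = np.random.default_rng(0)
base = np.tile(np.linspace(0.0, 255.0, 32), (32, 1))
denoised_clip = np.stack([base] * 6)
ugc_clip = denoised_clip + rng.normal(0.0, 4.0, size=denoised_clip.shape)

user_qp = 18
qp_star = saturation_qp(ugc_clip, denoised_clip)
final_qp = choose_qp(user_qp, qp_star)
```

Only the last step, taking the larger of the user-input QP and the saturation QP, is stated directly in the captions; the detection and aggregation rules would need to be replaced with the paper's DSD or RDSD procedures in practice.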

Theorems & Definitions (5)

  • Definition 3.1
  • Proposition 1
  • proof
  • Proposition 2
  • proof