Degradation-Aware Hierarchical Termination for Blind Quality Enhancement of Compressed Video

Li Yu, Yingbo Zhao, Shiyu Wu, Siyue Yu, Moncef Gabbouj, Qingshan Liu

TL;DR

This work addresses blind quality enhancement for compressed video when quantization parameters ($QP$) are unknown, introducing a degradation-aware framework that combines a Degradation Representation Learning (DRL) module with a Hierarchical Termination-based Artifact Reduction (HTAR) module. DRL captures multiscale, spatially varying degradation via a degradation tensor $f_r$, a degradation vector $f_v$, and a degradation level $f_c$, and is trained with both contrastive and classification losses to disentangle degradation from content. The blind QECV network uses coarse alignment and STDA/DGLF-based artifact reduction blocks arranged in a hierarchical stack, terminating processing dynamically according to $f_c$ to balance quality and efficiency. Empirical results on MFQE 2.0 show state-of-the-art PSNR/SSIM improvements across HEVC and VVC, with notable generalization to unseen $QP$ values and substantially shorter inference time for lightly compressed (low-$QP$) inputs. The paper also visualizes the learned degradation representations and reports ablation studies, which support combining degradation-aware feature modulation with a dual-branch temporal-spatial fusion strategy.
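To make the distinction between a global degradation vector and a spatially varying degradation tensor concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names `channel_modulate` and `spatial_modulate`, the use of simple multiplication as the modulation, and the toy shapes are all hypothetical. Only the symbols $f_v$ and $f_r$ come from the summary above.

```python
# Hypothetical illustration: names, shapes, and the multiplicative
# modulation are assumptions, not taken from the paper's code.

def channel_modulate(feature, f_v):
    """Scale each channel of a C x H x W feature by one scalar from f_v (global)."""
    return [[[value * f_v[c] for value in row] for row in channel]
            for c, channel in enumerate(feature)]

def spatial_modulate(feature, f_r):
    """Scale each position of a C x H x W feature by the matching entry of f_r."""
    return [[[value * f_r[c][y][x] for x, value in enumerate(row)]
             for y, row in enumerate(channel)]
            for c, channel in enumerate(feature)]

# One channel, 2 x 2 feature map filled with ones.
feature = [[[1.0, 1.0], [1.0, 1.0]]]
f_v = [0.5]                         # one weight for the whole channel
f_r = [[[0.2, 0.9], [0.9, 0.2]]]    # one weight per spatial position

global_out = channel_modulate(feature, f_v)   # every position scaled alike
local_out = spatial_modulate(feature, f_r)    # each position scaled separately
```

With the vector, every position in a channel receives the same weight, so regions with different artifact severity cannot be treated differently. The tensor carries one weight per position, which is the kind of spatial detail the summary attributes to $f_r$.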

Abstract

Existing studies on Quality Enhancement for Compressed Video (QECV) predominantly rely on known Quantization Parameters (QPs), employing distinct enhancement models per QP setting, termed non-blind methods. However, in real-world scenarios involving transcoding or transmission, QPs may be partially or entirely unknown, limiting the applicability of such approaches and motivating the development of blind QECV techniques. Current blind methods generate degradation vectors via classification models with cross-entropy loss, using them as channel attention to guide artifact removal. However, these vectors capture only global degradation information and lack spatial details, hindering adaptation to varying artifact patterns at different spatial positions. To address these limitations, we propose a pretrained Degradation Representation Learning (DRL) module that decouples and extracts high-dimensional, multiscale degradation representations from video content to guide artifact removal. Additionally, both blind and non-blind methods typically employ uniform architectures across QPs, thereby overlooking the varying computational demands inherent to different compression levels. We thus introduce a hierarchical termination mechanism that dynamically adjusts the number of artifact reduction stages based on the compression level. Experimental results demonstrate that the proposed approach significantly enhances performance, achieving a PSNR improvement of 110% (from 0.31 dB to 0.65 dB) over a competing state-of-the-art blind method at QP = 22. Furthermore, the proposed hierarchical termination mechanism reduces the average inference time at QP = 22 by half compared to QP = 42.
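As a quick check of the relative figure quoted in the abstract, assuming the 110% is the relative increase from the competing method's PSNR gain (0.31 dB) to the proposed method's gain (0.65 dB):

$$\frac{0.65 - 0.31}{0.31} = \frac{0.34}{0.31} \approx 1.097 \approx 110\%$$

In absolute terms, the difference between the two gains is 0.34 dB.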

Paper Structure

This paper contains 12 sections, 12 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: Overview of blind QECV methods. (a) shows an existing method that estimates a QP vector to guide artifact removal. (b) presents our method, which extracts fine-grained degradation representation and employs a hierarchical termination mechanism to adaptively perform multi-stage artifact reduction. The difference map in the center highlights spatially varying degradation, where the red region indicates more severe artifacts than the blue one. Our method achieves superior enhancement results on both regions over the existing method.
  • Figure 2: The framework of the proposed method, which comprises (a) Degradation Representation Learning (DRL) module and (b) blind QECV network. The DRL module extracts multi-scale degradation information of the target frame, including degradation tensor (blue arrow), degradation vector (green arrow), and degradation level (orange arrow), which is then fed into the blind QECV network. This network includes three key components: a Coarse Alignment module for frame alignment, a Hierarchical Termination-based Artifact Reduction (HTAR) module, and a Quality Enhancement module. The HTAR module incorporates up to five artifact reduction stages, each consisting of an STDA block and a DGLF block, enabling adaptive computational cost and efficient artifact removal based on degradation severity. Finally, the Quality Enhancement module further refines the spatial features of the target frame to improve visual quality.
  • Figure 3: Illustration of quality fluctuations for two test sequences compressed with QP 37. (Top: Class D, BasketballPass. Bottom: Class C, BasketballDrill.)
  • Figure 4: Visualization of Degradation Representation Learning (DRL) and Classification Learning (CL) on HEVC. (a) Clustering of DRL with seen QPs. (b) Clustering of DRL with unseen QPs. (c) Clustering of CL with seen QPs. (d) Clustering of CL with unseen QPs.
  • Figure 5: Detailed visualization on four sequences: BlowingBubbles (416x240), FourPeople (1280x720), RaceHorse (416x240), BasketballDrill (832x480).
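The hierarchical termination described in the Figure 2 caption (up to five artifact reduction stages, with the number used depending on degradation severity) can be sketched as follows. This is a minimal sketch under explicit assumptions rather than the paper's algorithm: the mapping from a degradation level to a stage count (`stages_for_level`), the assumption of five discrete levels, and the placeholder stages are all hypothetical. The source states only that there are at most five stages and that termination depends on the degradation level.

```python
from typing import Callable, List

MAX_STAGES = 5  # Figure 2 caption: "up to five artifact reduction stages"

def stages_for_level(f_c: int) -> int:
    """Assumed mapping: level 0 (mildest) .. 4 (most severe) -> 1 .. 5 stages."""
    if not 0 <= f_c < MAX_STAGES:
        raise ValueError(f"degradation level must be in [0, {MAX_STAGES - 1}]")
    return f_c + 1

def run_htar(feature: float, f_c: int,
             stages: List[Callable[[float], float]]) -> float:
    """Apply only the first stages_for_level(f_c) stages, then terminate."""
    for stage in stages[:stages_for_level(f_c)]:
        feature = stage(feature)
    return feature

# Placeholder stages: each stands in for one STDA + DGLF pair and only
# records that it ran; a real stage would transform the feature.
calls: List[int] = []

def make_stage(index: int) -> Callable[[float], float]:
    def stage(x: float) -> float:
        calls.append(index)
        return x
    return stage

stages = [make_stage(i) for i in range(MAX_STAGES)]

run_htar(0.0, f_c=0, stages=stages)   # mild degradation: terminates early
mild_calls = list(calls)
calls.clear()

run_htar(0.0, f_c=4, stages=stages)   # severe degradation: runs every stage
severe_calls = list(calls)
```

Under this assumed mapping, a mildly degraded frame passes through one stage and a severely degraded frame through all five, which is the kind of behavior that would make lightly compressed inputs cheaper to process.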