IQNet: Image Quality Assessment Guided Just Noticeable Difference Prefiltering For Versatile Video Coding

Yu-Han Sun, Chiang Lo-Hsuan Lee, Tian-Sheuan Chang

TL;DR

This work improves video coding efficiency by learning a fine-grained JND prefiltering strategy guided by no-reference image quality assessment. It introduces an IQA-guided JND dataset that captures coding effects and perceptual enhancements, and a lightweight network, IQNet, that predicts JND adjustments in luminance with only 3K parameters and is usable across QPs. The approach achieves bitrate savings of up to 41% for all-intra and 53% for low-delay P (15% and 19% on average, respectively) with negligible perceptual loss, while delivering higher perceptual quality (VMAF) than prior CNN-based methods at a far smaller model size. The method offers a practical, scalable solution for VVC prefiltering in real-time HD video applications, reducing the need for extensive subjective testing.

Abstract

Image prefiltering with just noticeable distortion (JND) improves coding efficiency in a visually lossless way by filtering out perceptually redundant information prior to compression. However, real JND cannot be modeled well by the inaccurate masking equations of traditional approaches or by the image-level subjective tests used in deep learning approaches. Thus, this paper proposes a fine-grained JND prefiltering dataset guided by image quality assessment for accurate block-level JND modeling. The dataset is constructed from decoded images to include coding effects and is also perceptually enhanced with block overlap and edge preservation. Furthermore, based on this dataset, we propose a lightweight JND prefiltering network, IQNet, which can be applied directly to different quantization cases with the same model and needs only 3K parameters. The experimental results show that the proposed approach, applied to Versatile Video Coding, could yield maximum/average bitrate savings of 41%/15% and 53%/19% for all-intra and low-delay P configurations, respectively, with negligible subjective quality loss. Our method demonstrates higher perceptual quality and a model size that is an order of magnitude smaller than previous deep learning methods.
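
To make the idea of JND prefiltering concrete, the following is a minimal sketch of a generic block-level prefilter, not the IQNet method itself. It assumes a single JND threshold per block is already known (in the paper that role is played by the learned model); the function name `jnd_prefilter_block` and the pull-toward-the-mean rule are illustrative assumptions. Each luminance sample is moved toward the block mean by at most the threshold, which flattens the block and typically lowers the bits needed to code it while keeping every change within the assumed visibility limit.

```python
import numpy as np


def jnd_prefilter_block(block, jnd):
    """Hypothetical helper: smooth one luma block within a JND threshold.

    Assumption: `jnd` is a non-negative scalar threshold for the whole block.
    Samples within `jnd` of the block mean are set to the mean; all other
    samples are moved toward the mean by exactly `jnd`.
    """
    block = np.asarray(block, dtype=np.float64)
    mean = block.mean()
    diff = block - mean
    # No sample changes by more than `jnd`, so the distortion stays bounded.
    return np.where(np.abs(diff) <= jnd, mean, block - np.sign(diff) * jnd)


luma = np.array([[100, 102], [98, 120]])
filtered = jnd_prefilter_block(luma, jnd=3.0)
print(filtered)
```

A larger threshold removes more detail and saves more bits, so the quality of the threshold estimate decides whether the result stays visually lossless; that estimate is what the abstract says the proposed dataset and network are designed to provide at block level.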

Paper Structure (22 sections, 18 figures, 4 tables)

Figures (18)

  • Figure 1: The flow of training data generation.
  • Figure 2: JND prefiltering flow for training data.
  • Figure 3: JND injection.
  • Figure 4: DCT filter for an 8x8 DCT block.
  • Figure 5: Calculation of minor edges.
  • ...and 13 more figures