SG-JND: Semantic-Guided Just Noticeable Distortion Predictor For Image Compression

Linhan Cao, Wei Sun, Xiongkuo Min, Jun Jia, Zicheng Zhang, Zijian Chen, Yucheng Zhu, Lizhou Liu, Qiubo Chen, Jing Chen, Guangtao Zhai

TL;DR

The paper addresses the prediction of just noticeable distortion (JND), the threshold at which compression distortion becomes perceptible to human viewers, by incorporating semantic information. It introduces SG-JND, a three-module architecture that splits images into patches, extracts multi-layer semantic features with a ResNet-50 backbone guided by cross-scale attention, and predicts patch-level quality scores and weights that are combined into a robust image-level JND, further refined by a sliding-window strategy. Experiments on the MCL-JCI and KonJND-1k datasets show state-of-the-art accuracy: a large share of predictions fall close to the ground-truth JND, and the PSNR of the predicted JND images correlates strongly with the PSNR of the ground-truth JND images (PLCC above 0.97 on every test set reported in Figure 4). The work demonstrates the value of semantic-guided feature fusion for perceptual threshold estimation, with practical implications for content-aware image compression and quality assessment.
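
The patch-to-image step described in the TL;DR can be illustrated with a short sketch. This summary does not give the exact aggregation formula, so the weighted average used here is an assumption, and the function name `aggregate_patch_predictions` and the sample values are hypothetical.

```python
def aggregate_patch_predictions(patch_scores, patch_weights, eps=1e-8):
    """Combine per-patch predictions into one image-level value.

    Assumption: a weighted average, sum(w_i * q_i) / sum(w_i), with
    non-negative weights. The paper's actual aggregation may differ.
    """
    if len(patch_scores) != len(patch_weights):
        raise ValueError("scores and weights must have the same length")
    if not patch_scores:
        raise ValueError("at least one patch is required")
    if any(w < 0 for w in patch_weights):
        raise ValueError("weights must be non-negative")

    weighted_sum = sum(w * q for w, q in zip(patch_weights, patch_scores))
    # eps guards against division by zero when every weight is zero.
    return weighted_sum / (sum(patch_weights) + eps)


# Hypothetical predictions for three patches of a single image.
image_level = aggregate_patch_predictions([40.0, 50.0, 60.0], [1.0, 2.0, 1.0])
```

With these made-up inputs the middle patch carries twice the weight of its neighbours, so the result sits at roughly 50. A weighted average of this kind lets the model down-weight patches whose content says little about where distortion becomes visible.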

Abstract

Just noticeable distortion (JND), representing the threshold of distortion in an image that is minimally perceptible to the human visual system (HVS), is crucial for image compression algorithms to achieve a trade-off between transmission bit rate and image quality. However, traditional JND prediction methods rely only on pixel-level or sub-band-level features and cannot capture the impact of image content on JND. To bridge this gap, we propose a Semantic-Guided JND (SG-JND) network that leverages semantic information for JND prediction. In particular, SG-JND consists of three essential modules: the image preprocessing module extracts semantic-level patches from images, the feature extraction module extracts multi-layer features using cross-scale attention layers, and the JND prediction module regresses the extracted features into the final JND value. Experimental results show that SG-JND achieves state-of-the-art performance on two publicly available JND datasets, which demonstrates the effectiveness of SG-JND and highlights the significance of incorporating semantic information in JND assessment.

Paper Structure

This paper contains 14 sections, 7 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: The overall framework of the proposed SG-JND model.
  • Figure 2: The structure of the cross-scale mechanism.
  • Figure 3: Statistics of experimental results on the two datasets. (a) Histogram of the absolute error between the predicted JND and the ground truth JND on the MCL-JCI dataset. (b) Histogram of the absolute error between the predicted JND and the ground truth JND on the KonJND-1k dataset.
  • Figure 4: PSNR comparison between the ground truth JND images and the predicted JND images on (a) the MCL-JCI dataset, (b) the JPEG-compressed images of the KonJND-1k dataset, and (c) the BPG-compressed images of the KonJND-1k dataset. The corresponding PLCCs are 0.9746, 0.9866, and 0.9719.
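
The PLCC values quoted for Figure 4 are Pearson linear correlation coefficients between two lists of PSNR values. The sketch below shows how such a number could be computed. It is not the authors' evaluation code: the helper names `psnr` and `plcc`, the 8-bit peak value of 255, and the sample PSNR lists are all assumptions made for illustration.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """PSNR in dB between two equally sized sequences of pixel values.

    Assumes 8-bit pixels (peak value 255) unless max_value is overridden.
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


def plcc(x, y):
    """Pearson linear correlation coefficient between two sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("PLCC is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical PSNR values (dB), one per image, at the ground-truth
# JND point and at the predicted JND point.
psnr_ground_truth = [30.0, 32.0, 34.0, 36.0]
psnr_predicted = [31.0, 33.0, 35.0, 37.0]
correlation = plcc(psnr_ground_truth, psnr_predicted)
```

Because the made-up predicted values are a constant shift of the ground-truth values, the correlation here is 1. This also shows a limitation of the metric: PLCC measures linear agreement and does not penalize a constant offset, which is why it is usually read alongside an absolute-error statistic such as the histograms in Figure 3.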