Assessing UHD Image Quality from Aesthetics, Distortions, and Saliency

Wei Sun; Weixia Zhang; Yuqin Cao; Linhan Cao; Jun Jia; Zijian Chen; Zicheng Zhang; Xiongkuo Min; Guangtao Zhai

TL;DR

A multi-branch deep neural network that assesses the quality of UHD images from three perspectives (global aesthetic characteristics, local technical distortions, and salient content perception) and achieves the best performance on the UHD-IQA dataset.

Abstract

UHD images, typically with resolutions of 4K or higher, pose a significant challenge for efficient image quality assessment (IQA): feeding full-resolution images to a model incurs prohibitive computational cost, while common pre-processing methods such as resizing or cropping can discard substantial detail. To address this problem, we design a multi-branch deep neural network (DNN) that assesses the quality of UHD images from three perspectives: global aesthetic characteristics, local technical distortions, and salient content perception. Specifically, aesthetic features are extracted from low-resolution images downsampled from the UHD ones, which lose high-frequency texture information but still preserve the global aesthetic characteristics. Technical distortions are measured on a fragment image composed of mini-patches cropped from the UHD image using the grid mini-patch sampling strategy. The salient content of the UHD image is detected and cropped so that quality-aware features can be extracted from the salient regions. We adopt Swin Transformer Tiny as the backbone network for extracting features from each of the three perspectives. The extracted features are concatenated and regressed into a quality score by a two-layer multi-layer perceptron (MLP). We employ the mean square error (MSE) loss to optimize prediction accuracy and the fidelity loss to optimize prediction monotonicity. Experimental results show that the proposed model achieves the best performance on the UHD-IQA dataset while maintaining the lowest computational complexity, demonstrating its effectiveness and efficiency. Moreover, the proposed model won first prize in the ECCV AIM 2024 UHD-IQA Challenge. The code is available at https://github.com/sunwei925/UIQA.
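
The training objective described in the abstract combines an MSE term (accuracy) with a fidelity term (monotonicity). The following is a minimal sketch of one common pairwise form of the fidelity loss, not the authors' implementation. The function names, the hard 0/0.5/1 ground-truth preference, the Gaussian-CDF link with unit variance, and the `weight` parameter are all assumptions made for illustration.

```python
import math


def mse_loss(pred, target):
    """Mean squared error between predicted and ground-truth scores."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def fidelity_loss(pred, target):
    """Pairwise fidelity loss averaged over all pairs i < j (assumed form).

    For each pair, p is the ground-truth probability that image i is
    better than image j, and p_hat is the predicted probability.
    """
    n = len(pred)
    if n < 2:
        raise ValueError("fidelity loss needs at least two samples")
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            diff = target[i] - target[j]
            # Assumption: hard preference label derived from the ground-truth scores.
            p = 1.0 if diff > 0 else 0.0 if diff < 0 else 0.5
            # Assumption: standard normal CDF of (q_i - q_j) / sqrt(2).
            p_hat = 0.5 * (1.0 + math.erf((pred[i] - pred[j]) / 2.0))
            total += 1.0 - math.sqrt(p * p_hat) - math.sqrt((1.0 - p) * (1.0 - p_hat))
            pairs += 1
    return total / pairs


def total_loss(pred, target, weight=1.0):
    """MSE plus weighted fidelity loss; `weight` is a hypothetical hyperparameter."""
    return mse_loss(pred, target) + weight * fidelity_loss(pred, target)
```

The fidelity term depends only on score differences within a batch, so it rewards correct rankings even when absolute values are off, while the MSE term anchors predictions to the score scale. An autograd framework would typically add a small epsilon inside the square roots for numerical stability.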

Paper Structure (14 sections, 12 equations, 3 figures, 7 tables)

Introduction
Related Work
General-purpose IQA Methods
High-resolution IQA Methods
Methods
Image Pre-processing Module
Feature Extraction Module
Quality Regression Module
Loss Function
Experiment
Experimental Protocol
Experimental Results
Ablation Studies
Conclusion

Figures (3)

Figure 1: The computational complexity (MACs) of existing IQA methods with different input resolutions.
Figure 2: Different image pre-processing methods for UHD images. (a) The proposed method, which uses the resized image, the fragment image, and the salient patch to extract aesthetic, distortion, and salient-content features, respectively. (b) Sampling all non-overlapping image patches for feature extraction [tan2024highly]. (c) Selecting the three representative patches with the highest texture complexity for feature extraction [zhu2021perceptual, lu2022deep].
Figure 3: The diagram of the proposed model. It consists of three modules: the image pre-processing module, the feature extraction module, and the quality regression module. We assess the quality of UHD images from three perspectives: global aesthetic characteristics, local technical distortions, and salient content perception, which are evaluated by the aesthetic assessment branch, distortion measurement branch, and salient content perception branch, respectively.
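
The fragment image used by the distortion measurement branch (Figure 2a) is built by grid mini-patch sampling. The following is a minimal sketch of that idea: divide the image into a uniform grid, take one small patch at a random position inside each cell, and stitch the patches together in grid order. The function name `sample_fragment`, the default grid and patch sizes, and the uniform random offset are assumptions for illustration; the paper's actual settings are not given in this summary.

```python
import random


def sample_fragment(image, grid=4, patch=2, seed=0):
    """Build a (grid*patch) x (grid*patch) fragment from a 2D image.

    `image` is a list of rows; each element can be any pixel value.
    One patch x patch mini-patch is cropped at native resolution from a
    random position inside every grid cell (assumed sampling rule).
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    cell_h, cell_w = height // grid, width // grid
    if cell_h < patch or cell_w < patch:
        raise ValueError("each grid cell must be at least patch x patch")

    size = grid * patch
    fragment = [[None] * size for _ in range(size)]
    for gy in range(grid):
        for gx in range(grid):
            # Random top-left corner such that the mini-patch stays inside the cell.
            y0 = gy * cell_h + rng.randint(0, cell_h - patch)
            x0 = gx * cell_w + rng.randint(0, cell_w - patch)
            for dy in range(patch):
                for dx in range(patch):
                    fragment[gy * patch + dy][gx * patch + dx] = image[y0 + dy][x0 + dx]
    return fragment
```

Because each mini-patch is copied without resizing, local distortions such as noise or blur keep their original pixel-level appearance, while the fragment stays small enough for a standard backbone input.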
