Multi-task Feature Enhancement Network for No-Reference Image Quality Assessment

Li Yu

TL;DR

This paper introduces a multi-task NR-IQA framework that combines a high-frequency extraction network, a quality estimation network, and a distortion-aware network, and fuses the features from these tasks with an attention-based module.

Abstract

Due to the scarcity of labeled samples in Image Quality Assessment (IQA) datasets, numerous recent studies have proposed multi-task strategies, which exploit feature information from other tasks or domains to boost the IQA task. Nevertheless, No-Reference Image Quality Assessment (NR-IQA) methods built on multi-task strategies face two challenges. First, existing methods have not explicitly exploited texture details, which significantly influence image quality. Second, multi-task methods conventionally integrate features through simple operations such as addition or concatenation, which diminishes the network's capacity to accurately represent distorted features. To tackle these challenges, we introduce a novel multi-task NR-IQA framework. Our framework consists of three key components: a high-frequency extraction network, a quality estimation network, and a distortion-aware network. The high-frequency extraction network is designed to guide the model's focus towards high-frequency information, which is closely related to texture details. Meanwhile, the distortion-aware network extracts distortion-related features to distinguish different distortion types. To effectively integrate features from the different tasks, a feature fusion module is developed based on an attention mechanism. Empirical results on five standard IQA databases confirm that our method not only achieves high performance but also exhibits robust generalization ability.
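To make the contrast between plain addition and attention-based fusion concrete, here is a minimal NumPy sketch of a per-channel gated fusion. It is not the paper's fusion module: the function `attention_fuse`, the `weight` and `bias` parameters, and the feature shapes are all assumptions chosen for illustration. The gate is computed from globally pooled statistics of both feature maps, so the mix between the two branches varies per channel instead of being a fixed sum.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_fuse(primary, auxiliary, weight, bias):
    """Fuse two (C, H, W) feature maps with a per-channel gate.

    weight (C, 2C) and bias (C,) are hypothetical learned parameters;
    this is an illustrative gate, not the module described in the paper.
    """
    # Global average pooling summarises each channel as one scalar.
    pooled = np.concatenate(
        [primary.mean(axis=(1, 2)), auxiliary.mean(axis=(1, 2))]
    )
    # One gate value per channel, each in (0, 1).
    gate = sigmoid(weight @ pooled + bias)[:, None, None]
    # Convex combination: the gate decides how much each branch contributes.
    return gate * primary + (1.0 - gate) * auxiliary

rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
quality_feat = rng.standard_normal((C, H, W))      # stand-in for quality features
distortion_feat = rng.standard_normal((C, H, W))   # stand-in for auxiliary features
weight = 0.1 * rng.standard_normal((C, 2 * C))
bias = np.zeros(C)

fused = attention_fuse(quality_feat, distortion_feat, weight, bias)
```

With `weight` and `bias` set to zero the gate is 0.5 everywhere and the result reduces to the average of the two inputs, which is the fixed mixing that addition-style fusion gives; a learned gate can move away from that per channel.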

Paper Structure

This paper contains 17 sections, 9 equations, 2 figures, 7 tables.

Figures (2)

  • Figure 1: The framework of the proposed method. The framework consists of three branches: the Quality Estimation Network (QEN), the Distortion Aware Network (DAN), and the High Frequency Extraction Network (HFEN). The HFEN consists of vanilla convolution and high-frequency modules. VAN is the backbone of the QEN, and ResNet-50 is the backbone of the DAN. The QEN is the primary task, while the HFEN and the DAN are the auxiliary tasks. The DAN is pre-trained using contrastive learning to enhance the robustness of the distortion feature representation. The HFEN uses the high-frequency module to extract high-frequency details, which helps the QEN focus on key image components. In addition, an attention-based feature fusion method (AFF) is integrated for fusing distortion-aware features, and a feature fusion module (FFM) is proposed for adaptively fusing high-frequency features.
  • Figure 2: The gMAD competition results between VCRNet and our proposed method. (a) The proposed method is fixed at the high-quality level. (b) The proposed method is fixed at the low-quality level. (c) VCRNet is fixed at the high-quality level. (d) VCRNet is fixed at the low-quality level. To make the differences between the two images easier to compare, the ground-truth MOS of each image is shown.
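The Figure 1 caption says the HFEN extracts high-frequency details, but this page does not specify how its high-frequency module works. As a rough illustration of what "high-frequency information" means for an image, the sketch below subtracts a box-blurred copy from the input, leaving edges and fine texture. The blur-and-subtract approach and the names `box_blur` and `high_frequency` are assumptions for illustration only, not the paper's module.

```python
import numpy as np

def box_blur(image, radius=1):
    """Mean filter over a (2*radius+1)^2 window, using edge padding."""
    padded = np.pad(image, radius, mode="edge")
    h, w = image.shape
    size = 2 * radius + 1
    acc = np.zeros((h, w), dtype=float)
    # Sum every shifted view of the padded image that falls in the window.
    for dy in range(size):
        for dx in range(size):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / (size * size)

def high_frequency(image, radius=1):
    """High-pass residual: the image minus its low-pass (blurred) version."""
    image = np.asarray(image, dtype=float)
    return image - box_blur(image, radius)

# A flat region carries no high-frequency content; a step edge does.
flat = np.full((6, 6), 3.0)
step = np.zeros((6, 6))
step[:, 3:] = 1.0

flat_hf = high_frequency(flat)
step_hf = high_frequency(step)
```

On the flat image the residual is zero everywhere, while on the step image it is nonzero only in the two columns adjacent to the edge, which is the kind of texture and edge signal a high-frequency branch is meant to emphasize.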