Causal Perception Inspired Representation Learning for Trustworthy Image Quality Assessment

Lei Wang, Desen Yuan

TL;DR

The paper addresses two weaknesses of deep image quality assessment (IQA) models: vulnerability to adversarial perturbations and lack of interpretability. It introduces CPRL, a causal-perception-inspired representation learning framework that separates the causal perception representation (CPR) from the non-causal perception representation (N-CPR) using a soft-ranking channel-wise activation and a $PNS$-guided minimax objective, where $PNS$ denotes the probability of necessity and sufficiency. The authors also propose a score reflection attack for evaluation and report that CPRL is more robust than many state-of-the-art adversarial defense methods on four IQA benchmark datasets, while making the causal channels explicitly interpretable. The work advances trustworthy IQA by aligning representations with causally meaningful factors and suppressing spurious correlations, which matters when test-time inputs shift away from the training distribution.

Abstract

Despite great success in modeling visual perception, deep neural network based image quality assessment (IQA) remains unreliable in real-world applications because of its vulnerability to adversarial perturbations and its opaque black-box structure. In this paper, we propose to build a trustworthy IQA model via Causal Perception inspired Representation Learning (CPRL), together with a score reflection attack method for IQA models. More specifically, we assume that each image is composed of a Causal Perception Representation (CPR) and a non-causal perception representation (N-CPR). CPR is the cause of the subjective quality label and is invariant to imperceptible adversarial perturbations. Conversely, N-CPR has only spurious associations with the subjective quality label and may change significantly under adversarial perturbations. To extract the CPR from each input image, we develop a soft-ranking-based channel-wise activation function that mediates between causally sufficient deep features (beneficial for high prediction accuracy) and causally necessary ones (beneficial for high robustness), and we optimize it with an intervention-based minimax game. Experiments on four benchmark databases show that the proposed CPRL method outperforms many state-of-the-art adversarial defense methods and provides explicit model interpretation.
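
The soft-ranking channel-wise activation mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' formulation, which the abstract does not spell out. It assumes one plausible reading: channels are ranked by mean absolute activation using a smooth pairwise-sigmoid rank, and higher-ranked channels are scaled less. The function names, the `temperature` parameter, and the linear rank-to-weight mapping are all hypothetical.

```python
import numpy as np

def soft_rank(values, temperature=1.0):
    """Smooth approximation of descending ranks (largest value -> rank near 1)."""
    # diff[i, j] = values[j] - values[i]; its sigmoid is near 1 when channel j beats channel i.
    diff = values[None, :] - values[:, None]
    beats = 1.0 / (1.0 + np.exp(-diff / temperature))
    # Each channel contributes 0.5 against itself, so add 0.5 to start ranks at about 1.
    return beats.sum(axis=1) + 0.5

def soft_rank_channel_gate(features, temperature=1.0):
    """Scale each channel of a (C, H, W) feature map by a rank-based weight.

    Assumption for illustration only: the top-ranked channel keeps weight 1
    and the lowest-ranked channel is scaled by 1 / C.
    """
    num_channels = features.shape[0]
    magnitude = np.abs(features).mean(axis=(1, 2))  # per-channel strength
    ranks = soft_rank(magnitude, temperature)
    weights = (num_channels - ranks + 1.0) / num_channels
    return features * weights[:, None, None], weights

# Three constant channels with strengths 0.1, 3.0 and 1.0.
features = np.stack([np.full((2, 2), v) for v in (0.1, 3.0, 1.0)])
gated, weights = soft_rank_channel_gate(features, temperature=0.05)
```

With a small `temperature` the soft ranks approach hard ranks, so the strongest channel passes through almost unchanged while weaker channels are attenuated. A larger `temperature` makes the weights vary more smoothly with channel strength.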

Paper Structure (29 sections, 14 equations, 6 figures, 6 tables, 1 algorithm)

Figures (6)

  • Figure 1: Demonstration of adversarial example generation for an IQA model. Adding an imperceptible perturbation to an image can drastically change the score the IQA model predicts for it.
  • Figure 2: The CPRL module. We use this module to replace some of the ReLU layers in the ResNet backbone when performing the IQA task.
  • Figure 3: Causal graphs of IQA learning. Grey nodes represent unobserved variables. (a) Causal graphs of the traditional FR-IQA task during the training and testing phases. (b) Causal graphs of the traditional NR-IQA task during the training and testing phases. (c) Causal graph of NR-IQA learning under spurious correlations during the training phase. Due to shortcut learning, the network learns spurious correlations through the path $X\leftarrow C\rightarrow Y$. (d) Causal graph of NR-IQA learning during the testing phase. The variables in blue are not drawn from the same distribution as the training set: during the test phase, the distribution of the confounding variable $C$ changes (adversarial attacks). (e) Causal graph of NR-IQA adversarial learning. Adversarial training obtains new samples $X$ by changing the confounding variable $C$ and adds them to the training set. This can be viewed as a backdoor intervention.
  • Figure 4: Output landscape on a two-dimensional hyperplane for ResNet. One direction is the FGSM direction with a length of 1.0 pixels; the other is chosen at random. (a) ResNet. (b) ResNet with CPRL. The landscape with CPRL is flatter than the original one along both the random perturbation direction and the adversarial perturbation direction, which gives empirical evidence that CPRL is more robust.
  • Figure 5: Channel activation values (y-axis) of intermediate layers of the ResNet and CPRL models. Each plot shows natural and adversarial test examples. The channels are arranged in descending order of magnitude. Large-magnitude activations in CPRL are more stable than those in the original ResNet. This is because PNS optimization assigns larger values a higher perceptual correlation score, and non-perceptual perturbations have less impact on these channels.
  • ...and 1 more figure
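
The output-landscape probe described for Figure 4 can be sketched as follows. The caption does not give the authors' procedure in detail, so this is an illustration under stated assumptions: a toy logistic scorer (`toy_score`, hypothetical) stands in for the ResNet IQA model, its gradient is computed analytically, and the random direction is a random sign pattern. The grid evaluates the score at `x + a * d_fgsm + r * d_rand` for offsets `a` and `r` in `[-radius, radius]`.

```python
import numpy as np

def toy_score(x, w, b):
    """Hypothetical stand-in for an IQA model: flattened image -> score in (0, 100)."""
    return 100.0 / (1.0 + np.exp(-(x @ w + b)))

def toy_score_grad(x, w, b):
    """Analytic gradient of toy_score with respect to the input x."""
    s = toy_score(x, w, b) / 100.0
    return 100.0 * s * (1.0 - s) * w

def output_landscape(x, w, b, steps=11, radius=1.0, seed=0):
    """Evaluate the score on a 2-D grid spanned by an FGSM and a random direction."""
    rng = np.random.default_rng(seed)
    d_fgsm = np.sign(toy_score_grad(x, w, b))       # FGSM direction: sign of the input gradient
    d_rand = np.sign(rng.standard_normal(x.shape))  # random sign direction (an assumption)
    offsets = np.linspace(-radius, radius, steps)
    grid = np.empty((steps, steps))
    for i, a in enumerate(offsets):        # rows: step along the FGSM direction
        for j, r in enumerate(offsets):    # columns: step along the random direction
            grid[i, j] = toy_score(x + a * d_fgsm + r * d_rand, w, b)
    return offsets, grid

rng = np.random.default_rng(1)
w = 0.1 * rng.standard_normal(16)
x = rng.uniform(0.0, 1.0, size=16)
offsets, grid = output_landscape(x, w, b=0.0)
score_range = grid.max() - grid.min()  # smaller range means a flatter landscape
```

Comparing `score_range` (or a surface plot of `grid`) between two models over the same grid gives a rough, empirical measure of which model's output changes less under small input perturbations.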