MIGA: Mutual Information-Guided Attack on Denoising Models for Semantic Manipulation

Guanghao Li, Mingzhi Chen, Hao Yu, Shuting Dong, Wenhao Jiang, Ming Tang, Chun Yuan

TL;DR

The paper addresses the vulnerability of deep denoisers to semantic manipulation by introducing MIGA, a Mutual Information-Guided Attack that minimizes the task-relevant mutual information $I(x; D(x_{\textsc{n}} + \delta) \mid C)$. It formulates a three-loss objective with perturbation constraint, reconstruction fidelity, and a mutual-information term, and handles both known and unknown downstream tasks via cross-entropy and a MINE-based estimator, respectively. Empirical results across four denoisers and five datasets demonstrate that MIGA achieves perceptually clean outputs while systematically altering downstream semantics, revealing a security risk in real-world denoising systems. The work also proposes task-specific evaluation metrics and shows robustness to several defenses, underscoring the urgency of developing more resilient denoising techniques for safety-critical applications.
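
The three-term objective summarized above can be illustrated with a minimal sketch. The page does not give the exact form of each loss, so everything concrete here is an assumption made for illustration: the hinge penalty on the L-infinity norm of $\delta$, the mean squared error against a clean reference, the use of negative cross-entropy as the known-task mutual-information term, and the weights `w_con`, `w_rec`, `w_mi` and budget `eps`. Images are flattened to plain Python lists so the example runs without any third-party library.

```python
import math

def perturbation_loss(delta, eps):
    """Stand-in for L_con: hinge penalty on how far max|delta| exceeds the budget eps."""
    return max(0.0, max(abs(d) for d in delta) - eps)

def reconstruction_loss(denoised, x_ref):
    """Stand-in for L_rec: mean squared error between the denoised output and a clean reference."""
    return sum((a - b) ** 2 for a, b in zip(denoised, x_ref)) / len(x_ref)

def cross_entropy(probs, label):
    """Cross-entropy of the downstream model's class probabilities against the true label index."""
    return -math.log(probs[label])

def miga_style_objective(delta, denoised, x_ref, probs, label,
                         eps=8 / 255, w_con=1.0, w_rec=1.0, w_mi=1.0):
    """Weighted sum of the three terms; the attacker would minimize this over delta.

    Known-task case only: the mutual-information term is taken to be the
    negative cross-entropy, so minimizing it pushes the cross-entropy up.
    All weights and eps are hypothetical defaults, not values from the paper.
    """
    l_con = perturbation_loss(delta, eps)
    l_rec = reconstruction_loss(denoised, x_ref)
    l_mi = -cross_entropy(probs, label)
    return w_con * l_con + w_rec * l_rec + w_mi * l_mi

# Same perturbation and same (perfect) reconstruction, two downstream predictions.
delta = [0.01, -0.02]
image = [0.5, 0.5]
print(miga_style_objective(delta, image, image, probs=[0.9, 0.1], label=0))
print(miga_style_objective(delta, image, image, probs=[0.1, 0.9], label=0))
```

With the perturbation inside its budget and the output matching the reference, only the third term differs between the two calls, and the objective is lower when the downstream model assigns little probability to the true class. That is the behavior the summary describes: a clean-looking output whose semantics have moved.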

Abstract

Deep learning-based denoising models have been widely employed in vision tasks, functioning as filters to eliminate noise while retaining crucial semantic information. Additionally, they play a vital role in defending against adversarial perturbations that threaten downstream tasks. However, these models can be intrinsically susceptible to adversarial attacks due to their dependence on specific noise assumptions. Existing attacks on denoising models mainly aim at deteriorating visual clarity while neglecting semantic manipulation, rendering them either easily detectable or limited in effectiveness. In this paper, we propose Mutual Information-Guided Attack (MIGA), the first method designed to directly attack deep denoising models by strategically disrupting their ability to preserve semantic content via adversarial perturbations. By minimizing the mutual information between the original and denoised images, a measure of semantic similarity, MIGA forces the denoiser to produce perceptually clean yet semantically altered outputs. While these images appear visually plausible, they encode systematically distorted semantics, revealing a fundamental vulnerability in denoising models. These distortions persist in denoised outputs and can be quantitatively assessed through downstream task performance. We propose new evaluation metrics and systematically assess MIGA on four denoising models across five datasets, demonstrating its consistent effectiveness in disrupting semantic fidelity. Our findings suggest that denoising models are not always robust and can introduce security risks in real-world applications.

Paper Structure

This paper contains 39 sections, 1 theorem, 14 equations, 11 figures, 15 tables, 1 algorithm.

Key Result

Lemma 1

Increasing the cross-entropy loss $\mathcal{L}_{\text{CE}}(F(D(x_{\textsc{n}} + \delta)), C)$ reduces the task-relevant mutual information $I(x; D(x_{\textsc{n}} + \delta) \mid C)$ between the original and denoised images, conditioned on $C$.
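
For intuition about why cross-entropy and mutual information move in opposite directions, the following sketch checks a related, standard variational bound on a toy discrete example. It is not the paper's proof, and it concerns the simpler quantity $I(Z; C)$ between a discrete representation $Z$ and the label $C$ rather than the conditional quantity in Lemma 1. The bound is $I(Z; C) \ge H(C) - \mathrm{CE}(q)$, with equality when the classifier $q(c \mid z)$ equals the true posterior. The joint distribution and both classifiers below are made-up numbers.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(v * math.log(v) for v in p if v > 0)

def mutual_information(joint):
    """Exact I(Z; C) for a joint table joint[z][c]."""
    pz = [sum(row) for row in joint]
    pc = [sum(col) for col in zip(*joint)]
    return sum(p * math.log(p / (pz[i] * pc[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0)

def expected_cross_entropy(joint, q):
    """Expected cross-entropy of a classifier q[z][c] = q(c | z) under the joint."""
    return -sum(p * math.log(q[i][j])
                for i, row in enumerate(joint)
                for j, p in enumerate(row) if p > 0)

# Toy joint over two representation values (rows) and two classes (columns).
joint = [[0.4, 0.1],
         [0.1, 0.4]]
label_entropy = entropy([sum(col) for col in zip(*joint)])

true_posterior = [[0.8, 0.2], [0.2, 0.8]]   # p(c | z) implied by the joint
weaker_model = [[0.6, 0.4], [0.4, 0.6]]     # a less confident classifier

mi = mutual_information(joint)
bound_tight = label_entropy - expected_cross_entropy(joint, true_posterior)
bound_loose = label_entropy - expected_cross_entropy(joint, weaker_model)
print(mi, bound_tight, bound_loose)
```

The classifier with the higher cross-entropy certifies less information about the label, which is the qualitative relationship the lemma relies on.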

Figures (11)

  • Figure 1: Adversarial attacks on denoising models in automated driving systems. A noisy "No Straight" sign $A$ may contain both natural noise and adversarial perturbations designed to mislead the Automated Driving Model (ADM). (a) The denoiser typically restores it to a clean and correct form, allowing the ADM to make the correct "Turning Decision". (b) Traditional attacks on denoising models degrade image quality, making it blurry but rarely altering semantics. This often results in either a correct decision or an "Unrecognized Image Warning". (c) Our MIGA attack, with knowledge of the ADM, subtly alters semantics to induce an incorrect "Go-Straight Decision". (d) Without ADM knowledge, MIGA forces the denoised image’s semantics to resemble a reference "Go Straight" sign, leading to a wrong decision.
  • Figure 2: Architecture overview of MIGA. $\mathcal{L}_{\text{con}}$ enforces the imperceptible perturbation constraint; $\mathcal{L}_{\text{rec}}$ ensures image clarity using a clean reference image $x_{\text{ref}}$; $\mathcal{L}_{\text{MI}}$ characterizes the task-relevant mutual information. When the downstream task is known, the mutual information is estimated using $F$; when the downstream task is unknown, it is estimated using a mutual information estimator $T$.
  • Figure 3: Attack results on ImageNet-10. MIGA induces the denoising model to generate visually clean yet semantically altered outputs, leading pre-trained models to misclassify 'Leopard' as 'Maltese Dog' and 'Truck' as 'Car'.
  • Figure 4: Attack results on unknown tasks. MIGA can induce the image denoising process to shift towards the semantic direction of the reference by adding small perturbations. Zoom in to see details better.
  • Figure 5: Impact of different defense strategies on MIGA. A comparison of feature squeezing (Defense 1), non-local means denoising (Defense 2), and DiffPure (Defense 3) on image denoising performance.
  • ...and 6 more figures
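
The Figure 2 caption mentions a mutual information estimator $T$ for the unknown-task case, and the TL;DR calls it MINE-based. MINE rests on the Donsker-Varadhan lower bound $I(X; Y) \ge \mathbb{E}_{p(x,y)}[T] - \log \mathbb{E}_{p(x)p(y)}[e^{T}]$, which holds for any critic $T$. The sketch below evaluates that bound exactly on a small discrete distribution. Two simplifications are assumptions made for illustration: the critic is a fixed lookup table rather than a trained neural network, and the expectations are computed in closed form rather than from samples. The joint table is made up.

```python
import math

def dv_lower_bound(joint, critic):
    """Donsker-Varadhan bound E_joint[T] - log E_marginals[exp(T)] for tables indexed [x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    e_joint = sum(joint[i][j] * critic[i][j]
                  for i in range(len(px)) for j in range(len(py)))
    e_marg = sum(px[i] * py[j] * math.exp(critic[i][j])
                 for i in range(len(px)) for j in range(len(py)))
    return e_joint - math.log(e_marg)

def exact_mutual_information(joint):
    """Exact I(X; Y) for comparison; every entry of joint is assumed positive."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(p * math.log(p / (px[i] * py[j]))
               for i, row in enumerate(joint) for j, p in enumerate(row))

joint = [[0.3, 0.2],
         [0.1, 0.4]]
px = [sum(row) for row in joint]
py = [sum(col) for col in zip(*joint)]

# The optimal critic is the log density ratio; any other table gives a looser bound.
optimal_critic = [[math.log(joint[i][j] / (px[i] * py[j])) for j in range(2)]
                  for i in range(2)]
rough_critic = [[1.0, 0.0], [0.0, 1.0]]

true_mi = exact_mutual_information(joint)
print(true_mi,
      dv_lower_bound(joint, optimal_critic),
      dv_lower_bound(joint, rough_critic))
```

In a MINE-style estimator the critic's parameters are adjusted by gradient ascent to tighten this bound, and the resulting value serves as the mutual-information estimate that an attack could then drive down.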

Theorems & Definitions (2)

  • Lemma 1
  • proof