Iterative Window Mean Filter: Thwarting Diffusion-based Adversarial Purification

Hanrui Wang, Ruoxi Sun, Cunjian Chen, Minhui Xue, Lay-Ki Soon, Shuo Wang, Zhe Jin

TL;DR

This paper proposes a novel, highly efficient, non-deep-learning-based image filter, the Iterative Window Mean Filter (IWMF), and a new adversarial purification framework, IWMF-Diff, which integrates IWMF with denoising diffusion models.

Abstract

Face authentication systems have brought significant convenience, yet they remain unreliable because they are sensitive to inconspicuous perturbations such as those introduced by adversarial attacks. Existing defenses are often weak against diverse attack algorithms and adaptive attacks, or they trade accuracy for security. To address these challenges, we develop a novel and highly efficient non-deep-learning-based image filter, the Iterative Window Mean Filter (IWMF), and propose a new adversarial purification framework, IWMF-Diff, which integrates IWMF with denoising diffusion models. Both methods can serve as pre-processing modules that eliminate adversarial perturbations without further modification or retraining of the target system. We demonstrate that the proposed methods fulfill four critical requirements: preserved accuracy, improved security, generalizability to various threats in different settings, and better resistance to adaptive attacks. This performance surpasses that of the state-of-the-art adversarial purification method, DiffPure.
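
To make the "pre-processing module" idea concrete, the following is a minimal sketch of a purification step placed in front of an unmodified model. It is not the IWMF algorithm, whose details are not given in this summary. It assumes a plain window mean (box) filter applied a fixed number of times to a grayscale image stored as a 2D list. The names `window_mean`, `purify`, and `defended_predict`, and the parameters `k` and `iterations`, are hypothetical and chosen only for illustration.

```python
def window_mean(image, k=3):
    """Replace every pixel with the mean of its k x k neighbourhood.

    `image` is a 2D list of numbers (grayscale). Borders are handled by
    clamping coordinates to the image, i.e. edge replication.
    NOTE: this is a plain box filter used as a stand-in, not the paper's IWMF.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd")
    h, w = len(image), len(image[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    total += image[yy][xx]
            out[y][x] = total / (k * k)
    return out


def purify(image, k=3, iterations=2):
    """Apply the window mean filter `iterations` times (assumed schedule)."""
    out = [[float(v) for v in row] for row in image]
    for _ in range(iterations):
        out = window_mean(out, k)
    return out


def defended_predict(model, image, k=3, iterations=2):
    """Run an unmodified `model` (any callable) on the purified input."""
    return model(purify(image, k, iterations))
```

Because the filter only touches the input, the wrapped model needs no retraining or modification. For example, `defended_predict(model, image)` can stand in wherever `model(image)` was called, with `k` and `iterations` tuned to trade smoothing strength against preserved detail.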

Paper Structure (31 sections, 12 equations, 17 figures, 19 tables, 4 algorithms)

Figures (17)

  • Figure 1: Adversarial attack against face authentication. (b) represents an impersonation attack.
  • Figure 2: Framework of IWMF-Diff.
  • Figure 3: Iterative window mean filter (IWMF).
  • Figure 4: The change after processing by IWMF. Edges are smoothed but remain observable.
  • Figure 5: The distribution of the differences between the adversarial examples and source images typically exhibits maximum absolute values before defense; after processing by IWMF, the differences tend to cluster around zero.
  • ...and 12 more figures