Iterative Window Mean Filter: Thwarting Diffusion-based Adversarial Purification

Hanrui Wang; Ruoxi Sun; Cunjian Chen; Minhui Xue; Lay-Ki Soon; Shuo Wang; Zhe Jin

TL;DR

The paper proposes the Iterative Window Mean Filter (IWMF), a novel and highly efficient non-deep-learning image filter, and IWMF-Diff, a new adversarial purification framework that integrates IWMF with denoising diffusion models.

Abstract

Face authentication systems offer significant convenience and have advanced considerably, yet they remain unreliable because they are sensitive to inconspicuous perturbations, such as those introduced by adversarial attacks. Existing defenses often either show weaknesses against diverse attack algorithms and adaptive attacks, or trade accuracy for security. To address these challenges, we develop a novel and highly efficient non-deep-learning image filter, the Iterative Window Mean Filter (IWMF), and propose a new adversarial purification framework, IWMF-Diff, which integrates IWMF with denoising diffusion models. Both methods function as pre-processing modules that eliminate adversarial perturbations without requiring any modification or retraining of the target system. We demonstrate that the proposed methods fulfill four critical requirements: preserved accuracy, improved security, generalizability to various threats in different settings, and better resistance to adaptive attacks. On these requirements, they outperform the state-of-the-art adversarial purification method, DiffPure.
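
As an illustration only of how a pre-processing module can sit in front of an unmodified authentication model, the sketch below uses a plain iterated window mean (box) filter as a stand-in. It is not the authors' IWMF, whose details this summary does not give, and it omits the diffusion-based restoration step entirely. The names `window_mean_filter` and `purify_then_authenticate`, and the parameters `window` and `iterations`, are hypothetical.

```python
import numpy as np


def window_mean_filter(image, window=3, iterations=2):
    """Hypothetical stand-in filter (NOT the paper's IWMF).

    Replaces every pixel with the mean of its window x window
    neighbourhood, and repeats this `iterations` times.
    Accepts arrays shaped (H, W) or (H, W, C).
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    pad = window // 2
    out = np.asarray(image, dtype=np.float64)
    height, width = out.shape[0], out.shape[1]
    # Pad only the two spatial axes; leave any channel axis untouched.
    pad_width = ((pad, pad), (pad, pad)) + ((0, 0),) * (out.ndim - 2)
    for _ in range(iterations):
        padded = np.pad(out, pad_width, mode="edge")
        acc = np.zeros_like(out)
        # Sum the shifted copies of the image, one per window offset.
        for dy in range(window):
            for dx in range(window):
                acc += padded[dy:dy + height, dx:dx + width]
        out = acc / (window * window)
    return out


def purify_then_authenticate(image, model, purifier=window_mean_filter):
    """Run the purifier first, then the unchanged target model.

    `model` is any callable; it is neither modified nor retrained.
    """
    return model(purifier(image))
```

Because the filter runs before the model and the model is passed in as an opaque callable, the target system needs no changes, which is the deployment property the abstract describes. How much a simple mean filter costs in accuracy, and how the authors' filter differs, is not something this sketch can show.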

Paper Structure (31 sections, 12 equations, 17 figures, 19 tables, 4 algorithms)

Introduction
Related Work
Adversarial Defenses
Denoising Diffusion Models
IWMF-Diff Framework
Threat Model
Iterative Window Mean Filter (IWMF)
Restoring IWMF-blurred Images by Diffusion Models
Experimental Settings
Deep Learning Models for Face Authentication
Datasets
Adversarial Attacks to Defend
Benchmark Defenses
Adaptive Attacks
Evaluation Metrics
...and 16 more sections

Figures (17)

Figure 1: Adversarial attack against face authentication. (b) represents an impersonation attack.
Figure 2: Framework of IWMF-Diff.
Figure 3: Iterative window mean filter (IWMF).
Figure 4: Changes after processing by IWMF. Edges are smoothed but remain observable.
Figure 5: Distribution of the differences between adversarial examples and source images. Before defense, the differences typically sit at their maximum absolute values; after processing by IWMF, they tend to cluster around zero.
...and 12 more figures
