Divide and Conquer: Heterogeneous Noise Integration for Diffusion-based Adversarial Purification

Gaozheng Pei; Shaojie Lyu; Gong Chen; Ke Ma; Qianqian Xu; Yingfei Sun; Qingming Huang

TL;DR

This work addresses the challenge of purifying adversarial perturbations with diffusion models without sacrificing semantic content. It introduces a heterogeneous forward process guided by neural attention, applying stronger noise to regions the model relies on and lighter noise elsewhere, complemented by a two-stage heterogeneous denoising that performs inpainting-like restoration before standard diffusion sampling. To counter strong adaptive attacks, the method replaces multi-step resampling with a single-step, DDIM-like update, substantially reducing time and memory costs. Across CIFAR-10, SVHN, and ImageNet, the approach yields consistent improvements in standard and robust accuracy over prior diffusion-based and training-based defenses, while enabling feasible gradient-based evaluation on commodity GPUs.

Abstract

Existing diffusion-based purification methods aim to disrupt adversarial perturbations by introducing a certain amount of noise through a forward diffusion process, followed by a reverse process to recover clean examples. However, this approach is fundamentally flawed: because the forward process operates uniformly across all pixels, it corrupts normal pixels while attempting to combat adversarial perturbations, causing the target model to produce incorrect predictions. Simply relying on low-intensity noise is insufficient for effective defense. To address this issue, we implement a heterogeneous purification strategy grounded in the interpretability of neural networks. Our method applies higher-intensity noise to the specific pixels that the target model focuses on, while the remaining pixels are subjected to only low-intensity noise. This requirement motivates us to redesign the sampling process of the diffusion model so that it can effectively remove varying noise levels. Furthermore, so that our method can be evaluated against strong adaptive attacks, it sharply reduces time cost and memory usage through single-step resampling. Empirical evidence from extensive experiments across three datasets demonstrates that our method outperforms most current adversarial training and purification techniques by a substantial margin.
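The heterogeneous forward process described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a standard DDPM forward step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, a linear beta schedule, and a precomputed attention map in [0, 1] that is thresholded into a binary mask. The names `make_alpha_bar`, `heterogeneous_forward`, `t_high`, `t_low`, and `threshold` are hypothetical and chosen only for illustration.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def heterogeneous_forward(x0, attention, alpha_bar, t_high, t_low,
                          threshold=0.5, rng=None):
    """Noise attended pixels up to t_high and all other pixels up to t_low.

    `attention` is assumed to be an array of the same shape as `x0`
    with values in [0, 1]; how it is obtained is outside this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    # 1.0 where the target model focuses, 0.0 elsewhere.
    mask = (attention >= threshold).astype(x0.dtype)
    eps = rng.standard_normal(x0.shape)

    def q_sample(t):
        # Standard DDPM forward step at timestep t.
        return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

    # Combine the strongly and lightly noised images pixel by pixel.
    x_mixed = mask * q_sample(t_high) + (1.0 - mask) * q_sample(t_low)
    return x_mixed, mask
```

A call such as `heterogeneous_forward(x0, attention, make_alpha_bar(), t_high=200, t_low=50)` returns the mixed-noise image together with the mask. A reverse process would then need to handle the two noise levels separately, which is the motivation the abstract gives for redesigning the sampler.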
