Random Sampling for Diffusion-based Adversarial Purification
Jiancheng Zhang, Peiran Dong, Yongyong Chen, Yin-Ping Zhao, Song Guo
TL;DR
The paper addresses the vulnerability of diffusion-based adversarial purification to adversarial perturbations by introducing random sampling, a DDIM-inspired mechanism that injects randomness at each diffusion step. It couples random sampling with mediator-guided conditional guidance to keep predictions on purified inputs consistent with those on clean inputs, and introduces DiffAP as a practical baseline that outperforms state-of-the-art methods, including under accelerated sampling. Key findings show that greater sampling randomness correlates with improved robustness, and that the mediator-guided approach yields stable, near-original classifier accuracy under attack, even with fewer denoising steps. The work provides a practical defense with strong robustness, tested under challenging asynchronous attacks, and points to future work comparing unconditional randomness with guided approaches.
Abstract
Denoising Diffusion Probabilistic Models (DDPMs) have gained great attention in adversarial purification. Current diffusion-based works focus on designing effective condition-guided mechanisms while ignoring a fundamental problem: the original DDPM sampling is intended for stable generation, which may not be the optimal solution for adversarial purification. Inspired by the stability of the Denoising Diffusion Implicit Model (DDIM), we propose an opposite sampling scheme called random sampling. In brief, random sampling draws from a random noisy space at each diffusion step, whereas DDPM and DDIM sampling repeatedly draw from the adjacent or original noisy space. Thus, random sampling introduces more randomness and achieves stronger robustness against adversarial attacks. Correspondingly, we also introduce a novel mediator conditional guidance to guarantee that the prediction on the purified image is consistent with the prediction on the clean image. To broaden the understanding of guided diffusion purification, we conduct a detailed evaluation with different sampling methods, and our random sampling achieves an impressive improvement in multiple settings. Leveraging mediator-guided random sampling, we also establish a baseline method named DiffAP, which significantly outperforms state-of-the-art (SOTA) approaches in performance and defensive stability. Remarkably, under strong attack, DiffAP achieves a robustness advantage of more than 20% even with 10$\times$ sampling acceleration.
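
The contrast between the three sampling schemes can be illustrated with the generalized DDIM reverse step, in which a single parameter sigma controls how much fresh noise enters at each step. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation: the "random" setting, sigma^2 = 1 - abar_prev, is one reading of "sample from a random noisy space" (the previous noise direction is discarded entirely), the function and argument names are hypothetical, `eps_pred` stands in for the output of a trained noise predictor, and mediator guidance is omitted.

```python
import math
import random


def predict_x0(x_t, eps_pred, abar_t):
    """Estimate the clean sample from x_t and the predicted noise."""
    return [(x - math.sqrt(1.0 - abar_t) * e) / math.sqrt(abar_t)
            for x, e in zip(x_t, eps_pred)]


def sigma_for(mode, abar_t, abar_prev):
    """Noise scale of one reverse step for each sampling scheme."""
    if mode == "ddim":    # deterministic: no fresh noise
        return 0.0
    if mode == "ddpm":    # standard DDPM posterior standard deviation
        return (math.sqrt((1.0 - abar_prev) / (1.0 - abar_t))
                * math.sqrt(1.0 - abar_t / abar_prev))
    if mode == "random":  # ASSUMPTION: all noise at this step is redrawn
        return math.sqrt(1.0 - abar_prev)
    raise ValueError(f"unknown mode: {mode}")


def reverse_step(x_t, eps_pred, abar_t, abar_prev, mode, rng):
    """One generalized DDIM reverse step on a flat list of floats."""
    sigma = sigma_for(mode, abar_t, abar_prev)
    x0 = predict_x0(x_t, eps_pred, abar_t)
    # Weight on the predicted noise direction; clamped against round-off.
    dir_coef = math.sqrt(max(1.0 - abar_prev - sigma ** 2, 0.0))
    z = [rng.gauss(0.0, 1.0) for _ in x_t]
    return [math.sqrt(abar_prev) * c + dir_coef * e + sigma * n
            for c, e, n in zip(x0, eps_pred, z)]
```

Under this reading, the three schemes differ only in sigma: DDIM reuses the predicted noise direction with no fresh noise, DDPM mixes the two, and the random variant keeps only the predicted clean sample and redraws the noise. For any valid schedule with abar_t < abar_prev < 1, the noise scale increases in that order.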
