VENOM: Text-driven Unrestricted Adversarial Example Generation with Diffusion Models
Hui Kuurila-Zhang, Haoyu Chen, Guoying Zhao
TL;DR
VENOM is a text-driven, diffusion-based framework for unrestricted adversarial example generation that unifies image content creation and adversarial synthesis within a single reverse diffusion process. It stabilizes adversarial guidance with a momentum-enhanced gradient and an adaptive control switch, which preserves image fidelity while maintaining a strong attack success rate (ASR) against defenses. The approach supports both Natural Adversarial Examples (NAEs, generated from random noise) and Unrestricted Adversarial Examples (UAEs, generated from reference images). It demonstrates superior image quality and competitive ASR in white-box settings, with nuanced transferability trade-offs across black-box models. The work advances adversarial robustness research by providing a flexible, purely text-driven way to generate realistic, effective adversarial examples and by informing defense design.
Abstract
Adversarial attacks have proven effective in deceiving machine learning models by subtly altering input images, motivating extensive research in recent years. Traditional methods constrain perturbations within $\ell_p$-norm bounds, but advances in Unrestricted Adversarial Examples (UAEs) allow for more complex, generative-model-based manipulations. Diffusion models now lead UAE generation due to their superior stability and image quality compared with GANs. However, existing diffusion-based UAE methods are limited to using reference images and struggle to generate Natural Adversarial Examples (NAEs) directly from random noise, often producing uncontrolled or distorted outputs. In this work, we introduce VENOM, the first text-driven framework for high-quality unrestricted adversarial example generation through diffusion models. VENOM unifies image content generation and adversarial synthesis into a single reverse diffusion process, enabling high-fidelity adversarial examples without sacrificing attack success rate (ASR). To stabilize this process, we incorporate an adaptive adversarial guidance strategy with momentum, ensuring that the generated adversarial examples $x^*$ align with the distribution $p(x)$ of natural images. Extensive experiments demonstrate that VENOM achieves superior ASR and image quality compared to prior methods, marking a significant advancement in adversarial example generation and providing insights into model vulnerabilities for improved defense development.
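
To make the idea of momentum-based adversarial guidance with an adaptive switch more concrete, the following is a minimal toy sketch in Python. It is not the VENOM implementation: the abstract does not specify the update rule, so everything here is an assumption made for illustration. In particular, the linear classifier `W`, the linear-interpolation "denoiser", the switching rule (guide only while the current clean estimate is still classified as the true label `y`), and the names `guided_sampling`, `beta`, and `scale` are all hypothetical stand-ins for a real diffusion model and victim classifier.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss_grad(W, x, y):
    """Gradient w.r.t. x of the cross-entropy loss of a linear classifier (logits = W @ x)."""
    p = softmax(W @ x)
    p[y] -= 1.0
    return W.T @ p

def guided_sampling(x_T, x_clean, W, y, steps=20, beta=0.5, scale=0.3):
    """Toy reverse process: content generation and adversarial guidance in one loop.

    Assumptions (not from the paper): x_clean stands in for the content a
    text-conditioned denoiser would produce, and the switch rule is hypothetical.
    """
    x = x_T.astype(float).copy()
    offset = np.zeros_like(x)    # accumulated adversarial shift of the clean estimate
    velocity = np.zeros_like(x)  # momentum buffer for the guidance gradient
    n_guided = 0
    for t in range(steps):
        target = x_clean + offset
        # Hypothetical adaptive switch: guide only while the estimate is not yet adversarial.
        if np.argmax(W @ target) == y:
            g = loss_grad(W, target, y)
            g /= np.linalg.norm(g) + 1e-12
            # Momentum: exponential moving average of normalized gradients.
            velocity = beta * velocity + (1.0 - beta) * g
            offset += scale * velocity
            target = x_clean + offset
            n_guided += 1
        # Toy denoising step: interpolate toward the current estimate,
        # reaching it exactly (up to rounding) at the final step.
        x = x + (target - x) / (steps - t)
    return x, n_guided

rng = np.random.default_rng(0)
W = np.eye(2)                      # two-class toy classifier
x_clean = np.array([1.0, 0.0])     # clean content, classified as label 0
x_T = rng.standard_normal(2)       # "random noise" starting point

x_adv, n_guided = guided_sampling(x_T, x_clean, W, y=0)
x_plain, _ = guided_sampling(x_T, x_clean, W, y=0, scale=0.0)  # guidance disabled
```

In this toy setting the switch turns guidance off as soon as the estimate crosses the decision boundary, so the sample stays close to the clean content instead of drifting further away. That is the intuition behind pairing guidance with a control mechanism; how VENOM actually realizes it is described in the paper's method section, not here.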
