Diffusion Attack: Leveraging Stable Diffusion for Naturalistic Image Attacking
Qianyu Guo, Jiaming Fu, Yawen Lu, Dongming Gan
TL;DR
The paper addresses adversarial security in VR, where typical attack images are visually conspicuous. It introduces Diffusion Attack, a framework that combines neural style transfer with a latent diffusion model to produce natural-looking adversarial images, using a mask to constrain where edits are applied and a joint loss $l_{total}$ that fuses $l_{content}$, $l_{style}$, $l_{adv}$, and $l_{smooth}$. By generating style references from text-to-image prompts with Stable Diffusion and optimizing for targeted misclassification against classifiers such as Inception V3, the approach achieves high perceptual quality as measured by no-reference image quality assessment (NR-IQA) metrics and related aesthetics scores. The work demonstrates that naturalistic adversarial examples can preserve semantic integrity while maintaining strong attack efficacy, which has implications for VR security and underscores the need for robust defenses against style-transfer-based attacks.
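As a reading aid, one plausible form of the joint loss is a weighted sum of its four terms. The summary above says only that $l_{total}$ "fuses" them, so the additive form and the weights $\lambda_{c}$, $\lambda_{s}$, $\lambda_{a}$, $\lambda_{m}$ below are assumptions made for illustration, not values or notation taken from the paper:

$$
l_{total} = \lambda_{c}\, l_{content} + \lambda_{s}\, l_{style} + \lambda_{a}\, l_{adv} + \lambda_{m}\, l_{smooth}
$$

Under this reading, increasing $\lambda_{a}$ would favor attack success at the expense of naturalness, while increasing $\lambda_{c}$ and $\lambda_{m}$ would keep the output closer to the original content and suppress high-frequency artifacts.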
Abstract
In Virtual Reality (VR), adversarial attacks remain a significant security threat. Most deep learning-based methods for physical and digital adversarial attacks focus on enhancing attack performance by crafting adversarial examples that contain large printable distortions, which are easy for human observers to identify. Attackers rarely constrain the naturalness and visual comfort of the generated attack image, so the resulting attack is noticeable and unnatural. To address this challenge, we propose a framework that incorporates style transfer to craft adversarial inputs in natural styles that are minimally detectable and maximally natural in appearance, while maintaining superior attack capabilities.
