SCA: Improve Semantic Consistent in Unrestricted Adversarial Attacks via DDPM Inversion

Zihao Pan, Lifeng Chen, Weibin Wu, Yuhang Cao, Zibin Zheng

TL;DR

The paper tackles the problem of generating unrestricted adversarial examples that alter high-level semantics while remaining photorealistic. It introduces SCA, a two-stage framework: Semantic Fixation Inversion imprints rich semantic priors from MLLMs into an edit-friendly diffusion latent space, and Semantically Guided Perturbation optimizes perturbations with gradient-free, semantically guided updates. With DPM Solver++ acceleration, SCA is roughly 12x faster on average than state-of-the-art attacks, while maintaining or improving semantic consistency (as measured by CLIP Score and LPIPS) and keeping attack success competitive across CNNs and ViTs. The approach produces Semantic-Consistent Adversarial Examples (SCAE) that preserve the original content and scene context, enabling more covert and transferable attacks. These results advance the understanding of diffusion-based adversarial vulnerabilities and point to potential avenues for robust defenses against semantically driven manipulation.

Abstract

Systems based on deep neural networks are vulnerable to adversarial attacks. Unrestricted adversarial attacks typically manipulate the semantic content of an image (e.g., color or texture) to create adversarial examples that are both effective and photorealistic. Recent works have used the diffusion inversion process to map images into a latent space, where high-level semantics are manipulated by introducing perturbations. However, these methods often cause substantial semantic distortions in the denoised output and suffer from low efficiency. In this study, we propose a novel framework called Semantic-Consistent Unrestricted Adversarial Attacks (SCA), which employs an inversion method to extract edit-friendly noise maps and uses a Multimodal Large Language Model (MLLM) to provide semantic guidance throughout the process. Conditioned on the rich semantic information provided by the MLLM, we perform each step of the DDPM denoising process using a series of edit-friendly noise maps and leverage DPM Solver++ to accelerate this process, enabling efficient sampling with semantic consistency. Compared to existing methods, our framework enables the efficient generation of adversarial examples that exhibit minimal discernible semantic changes. Consequently, we introduce, for the first time, Semantic-Consistent Adversarial Examples (SCAE). Extensive experiments and visualizations demonstrate the high efficiency of SCA, which is on average 12 times faster than state-of-the-art attacks. Our code can be found at https://github.com/Pan-Zihao/SCA.
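
The idea of extracting edit-friendly noise maps and reusing them at every DDPM denoising step can be illustrated with a small NumPy sketch. It assumes the noise maps are obtained in the style of edit-friendly DDPM inversion: each noisy latent is sampled independently from the clean input, and the noise map for a step is whatever residual makes the DDPM update land exactly on the next latent. Everything named here (`make_schedule`, `toy_eps_model`, `edit_friendly_inversion`, `reconstruct`, the linear beta schedule, and the choice sigma_t = sqrt(beta_t)) is a hypothetical stand-in, not code from the SCA repository. The toy noise predictor replaces the text-conditioned diffusion model, and MLLM captioning and DPM Solver++ are left out.

```python
import numpy as np


def make_schedule(T, beta_start=1e-4, beta_end=2e-2):
    """Linear beta schedule (an assumption; any valid DDPM schedule would do)."""
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_eps_model(x_t, t):
    """Hypothetical stand-in for a trained, caption-conditioned noise predictor."""
    return 0.5 * np.tanh(x_t) + 0.01 * t


def posterior_mean(x_t, t, eps_model, betas, alphas, alpha_bars):
    """Standard DDPM mean of p(x_{t-1} | x_t), using schedule index t."""
    eps = eps_model(x_t, t)
    return (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])


def edit_friendly_inversion(x0, eps_model, betas, alphas, alpha_bars, rng):
    """Return the final latent x_T and the per-step noise maps z_1..z_T."""
    T = len(betas)
    # xs[0] is the clean input; xs[k] is the latent after k noising steps.
    # Each latent uses its own independent noise sample.
    xs = [x0]
    for k in range(1, T + 1):
        noise = rng.standard_normal(x0.shape)
        xs.append(np.sqrt(alpha_bars[k - 1]) * x0
                  + np.sqrt(1.0 - alpha_bars[k - 1]) * noise)
    # zs[k] is the noise map that carries xs[k] exactly to xs[k - 1].
    zs = [None] * (T + 1)
    for k in range(T, 0, -1):
        mu = posterior_mean(xs[k], k - 1, eps_model, betas, alphas, alpha_bars)
        sigma = np.sqrt(betas[k - 1])
        zs[k] = (xs[k - 1] - mu) / sigma
    return xs[T], zs


def reconstruct(x_T, zs, eps_model, betas, alphas, alpha_bars):
    """Run the DDPM denoising steps with the stored noise maps."""
    x = x_T
    for k in range(len(betas), 0, -1):
        mu = posterior_mean(x, k - 1, eps_model, betas, alphas, alpha_bars)
        x = mu + np.sqrt(betas[k - 1]) * zs[k]
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    betas, alphas, alpha_bars = make_schedule(T=50)
    x0 = rng.standard_normal((4, 4))  # toy "image"
    x_T, zs = edit_friendly_inversion(x0, toy_eps_model, betas, alphas, alpha_bars, rng)
    x_rec = reconstruct(x_T, zs, toy_eps_model, betas, alphas, alpha_bars)
    print("max reconstruction error:", np.abs(x_rec - x0).max())
```

Because each noise map is defined as the residual of its own denoising step, replaying the steps with the same maps returns the input up to floating-point error. In this sketch, an edit or attack would correspond to changing the conditioning or the latents while keeping the stored maps fixed.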

Paper Structure

This paper contains 16 sections, 19 equations, 5 figures, 5 tables, 1 algorithm.

Figures (5)

  • Figure 1: SCA's advantage in maintaining semantic consistency. (a) Our method achieves perfect reconstruction and then introduces only minimal perturbations: only the text on the coffee cup and the coffee pattern change, while the overall semantics and environment remain consistent. This shows that our method can accurately identify the subject in the image and attack it without affecting the rest of the image, making the attack much harder for humans to perceive. (b) A comparison between SCA and ACA, the current state-of-the-art method. Compared with the adversarial examples generated by SCA, those from ACA show a large semantic deviation from the original image.
  • Figure 2: Failure cases of the existing state-of-the-art methods. Although these methods can generate rich and diverse content with the help of text-to-image models, large semantic deviations often occur. In each pair, the image on the left is the clean image and the image on the right is the generated adversarial example. In (a), information about a person is lost: the original image is described as "a woman sitting in a sled and a man standing beside her", but the generated result no longer contains the person sitting in the sled. In (b), the object structure is abnormal.
  • Figure 3: Pipeline of Semantic-Consistent Unrestricted Adversarial Attack. We first map the clean image into a latent space through Semantic Fixation Inversion, and then iteratively optimize the adversarial objective in the latent space under semantic guidance, causing the content of the image to shift in the direction of deceiving the model until the attack is successful.
  • Figure 4: Compared with other attacks, SCA generates the most natural adversarial examples and maintains a high degree of semantic consistency with the clean image.
  • Figure 5: Ablation study of the degree of semantic restriction. The caption of (b) is "a coffee cup with a heart-shaped design in the foam". The caption of (c) is "At the center of the frame is a white cup filled with a dark brown liquid, possibly coffee. The cup is placed on a white saucer, which is accompanied by a silver spoon. All these items rest on a tablecloth featuring a colorful floral pattern. On the cup, there's a heart-shaped latte art design, suggesting a level of care and attention given to the preparation of the drink."