AdvLogo: Adversarial Patch Attack against Object Detectors based on Diffusion Models

Boming Miao; Chunxiao Li; Yao Zhu; Weixiang Sun; Zizhe Wang; Xiaoyi Wang; Chuanlong Xie

TL;DR

AdvLogo tackles adversarial patch attacks on object detectors by leveraging diffusion models to explore adversarial subspaces in semantic space. It guides the diffusion denoising process by perturbing the last-timestep latent $z_T$ in the frequency domain and optimizing unconditional embeddings $\Phi_T$, with a DDIM-based gradient approximation to reduce compute. The paper demonstrates that frequency-domain perturbations preserve image distribution and visual fidelity while achieving strong black-box transferability, and that jointly optimizing unconditional embeddings further boosts attack effectiveness. These findings highlight a practical pathway to high-quality adversarial patches and offer insights into semantic-space vulnerabilities of detectors.

Abstract

With the rapid development of deep learning, object detectors have demonstrated impressive performance; however, vulnerabilities still exist in certain scenarios. Current research that explores these vulnerabilities using adversarial patches often struggles to balance attack effectiveness against visual quality. To address this problem, we propose a novel patch attack framework from a semantic perspective, which we refer to as AdvLogo. Based on the hypothesis that every semantic space contains an adversarial subspace where images can cause detectors to fail to recognize objects, we leverage the semantic understanding of the diffusion denoising process and drive the process toward adversarial subareas by perturbing the latent and unconditional embeddings at the last timestep. To mitigate the distribution shift that degrades image quality, we apply the perturbation to the latent in the frequency domain using the Fourier Transform. Experimental results demonstrate that AdvLogo achieves strong attack performance while maintaining high visual quality.
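
The frequency-domain perturbation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `perturb_latent_in_frequency`, the latent shape, the perturbation scale, and the choice to keep only the real part after the inverse transform are all assumptions made for illustration.

```python
import numpy as np

def perturb_latent_in_frequency(z_T, delta):
    """Add a perturbation to a latent in the 2-D Fourier domain and map it back.

    z_T   : real array of shape (C, H, W), a stand-in for the last-timestep latent.
    delta : complex array of the same shape, the frequency-domain perturbation.
    """
    z_freq = np.fft.fft2(z_T, axes=(-2, -1))          # latent -> frequency domain
    z_freq_adv = z_freq + delta                       # perturb the spectrum
    z_adv = np.fft.ifft2(z_freq_adv, axes=(-2, -1))   # back to the spatial domain
    return z_adv.real                                 # assumption: keep the real part only

rng = np.random.default_rng(0)
z_T = rng.standard_normal((4, 8, 8))                  # hypothetical latent shape (C, H, W)
delta = 0.1 * (rng.standard_normal(z_T.shape) + 1j * rng.standard_normal(z_T.shape))

z_adv = perturb_latent_in_frequency(z_T, delta)
z_same = perturb_latent_in_frequency(z_T, np.zeros_like(delta))
```

With a zero perturbation the round trip returns the original latent, so any change in `z_adv` comes entirely from `delta`.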

Paper Structure (20 sections, 15 equations, 3 figures, 4 tables, 1 algorithm)

Introduction
Related Work
Transfer-based Attack
Object Detectors
Patch Attack
Method
Preliminary
Overall Framework
Latent Frequency Domain
Unconditional Embeddings Optimization
Gradient Approximation
Experiment
Implementation Details
Main Results
Frequency Domain vs Spatial Domain
...and 5 more sections

Figures (3)

Figure 1: According to our hypothesis, the semantic space of "dog" contains an adversarial subspace. The visual appearances of two dogs from different regions of this space are distinct, making it impossible to transfer from one to the other under normal conditions. However, when significant noise is introduced, the distinctions between the noisy versions of the two dogs become less pronounced, making the transfer between these regions possible.
Figure 2: During the training stage, we first obtain a noisy latent representation and transform it into the frequency domain using the Fourier Transform. From this frequency-domain representation, we recover the latent variable $z_T$ by applying the Inverse Fourier Transform, then apply the denoising process to generate the corresponding patch. The patch is rendered onto the target object to create adversarial images, which are passed to an object detector to obtain a detection loss. The frequency-domain representation and the unconditional embedding are iteratively optimized to minimize this loss. During the attack stage, when the optimized patch is applied to the target (e.g., a person), object detectors fail to recognize the presence of the target. As a control, we apply a non-adversarial patch to the same location, which results in successful detection by the object detectors. This comparison underscores the effectiveness of our adversarial patch at evading detection.
Figure 3: Comparison of the visual quality of different adversarial patches. Our proposed method, AdvLogo-Hybrid, demonstrates high visual quality. For the ablation study, we also present logos generated with different optimization strategies. Specifically, AdvLogo-Embedding is obtained by optimizing $\Phi_T$ alone, AdvLogo-Frequency is obtained by optimizing $\tilde{z}_T$ alone, and AdvLogo-Spatial is obtained by optimizing $z_T$ alone. Both NAP and AdvLogo-Hybrid exhibit high visual quality, with AdvLogo-Hybrid achieving a higher aesthetic score than NAP.
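
The training loop in the Figure 2 caption (frequency representation, inverse transform, denoising, detection loss, iterative update) can be sketched as a toy example. Everything below is a hypothetical stand-in rather than the paper's method: `tanh` replaces the diffusion denoising process, a linear score followed by a sigmoid replaces the object detector, patch rendering and the unconditional-embedding update are omitted, and the gradient is derived analytically for this toy loss instead of using the paper's DDIM-based approximation.

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 8
w = rng.standard_normal((H, W)) / H        # hypothetical weights of a toy linear "detector"

def generate_patch(z_freq):
    """Toy stand-in for inverse FFT followed by denoising."""
    z_T = np.fft.ifft2(z_freq, norm="ortho").real   # frequency representation -> latent z_T
    return np.tanh(z_T)                             # placeholder for the denoising process

def detection_loss(patch):
    """Toy stand-in for the detector: a sigmoid 'objectness' score to minimize."""
    return 1.0 / (1.0 + np.exp(-np.sum(w * patch)))

def loss_grad(z_freq):
    """Analytic gradient of the toy loss w.r.t. the real and imaginary parts of z_freq."""
    z_T = np.fft.ifft2(z_freq, norm="ortho").real
    patch = np.tanh(z_T)
    s = detection_loss(patch)
    g_z = s * (1.0 - s) * w * (1.0 - patch ** 2)    # chain rule: dL/dz_T
    return np.fft.fft2(g_z, norm="ortho")           # adjoint of Re(ifft2(.)) for real g_z

z_freq = np.fft.fft2(rng.standard_normal((H, W)), norm="ortho")  # spectrum of a noisy latent
initial_loss = detection_loss(generate_patch(z_freq))
for _ in range(200):
    z_freq = z_freq - 0.5 * loss_grad(z_freq)       # gradient step in the frequency domain
final_loss = detection_loss(generate_patch(z_freq))
```

The orthonormal FFT convention (`norm="ortho"`) is chosen here so that a gradient step on the spectrum corresponds to a step of the same size on the latent; this is a convenience of the sketch, not something stated in the source.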
