Environmental Matching Attack Against Unmanned Aerial Vehicles Object Detection
Dehong Kong, Siyuan Liang, Wenqi Ren
TL;DR
This paper addresses the vulnerability of UAV object detectors to adversarial patches, treating naturalness with respect to the environment as a key constraint. It introduces the Environmental Matching Attack (EMA), which uses a text-guided diffusion prior and scene matching to constrain patch color and appearance, and which optimizes an additive perturbation instead of modifying the patch directly. The perturbation is optimized under an $\ell_{\infty}$ bound while environment-aware prompts guide the diffusion process, so that EMA balances attack efficacy with visual naturalness. Across digital and physical experiments on UAV datasets, EMA achieves near state-of-the-art attack performance with significantly improved patch naturalness, which points to stealthier adversarial patches in UAV contexts and can inform defense strategies.
Abstract
Object detection techniques for Unmanned Aerial Vehicles (UAVs) rely on Deep Neural Networks (DNNs), which are vulnerable to adversarial attacks. However, existing algorithms for generating adversarial patches in the UAV domain pay very little attention to the naturalness of the patches. Moreover, imposing constraints directly on an adversarial patch makes it difficult to generate a patch that appears natural to the human eye while maintaining a high attack success rate. We observe that patches look natural when their overall color is consistent with the environment. Therefore, we propose a new method named Environmental Matching Attack (EMA) to address the problem of optimizing an adversarial patch under color constraints. To the best of our knowledge, this paper is the first to consider natural patches in the UAV domain. EMA exploits the strong prior knowledge of a pretrained Stable Diffusion model to guide the optimization direction of the adversarial patch, where the text guidance can restrict the color of the patch. To better match the environment, the contrast and brightness of the patch are adjusted appropriately. Instead of optimizing the adversarial patch itself, we optimize an adversarial perturbation patch that is initialized to zero, so that the model can better trade off attack performance against naturalness. Experiments on the DroneVehicle and Carpk datasets show that our method reaches nearly the same attack performance as the baseline in the digital setting (a difference of no more than 2 in mAP$\%$), surpasses the baseline method in specific physical scenarios, and exhibits a significant advantage in naturalness, both in visualization and in color difference from the environment.
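The idea of optimizing a zero-initialized perturbation rather than the patch itself can be illustrated with a minimal sketch. This is not the authors' implementation: the linear "detector score", the signed-gradient update, the step size, and all function names below are assumptions made for illustration, and the diffusion prior and text guidance are not modeled at all. The sketch only shows that the base patch stays fixed, that the perturbation starts at zero and is projected back into an $\ell_{\infty}$ ball after each step, and that contrast and brightness can be adjusted as a separate operation.

```python
import numpy as np


def adjust_contrast_brightness(patch, contrast=1.0, brightness=0.0):
    """Scale pixel values around mid-gray (0.5), shift them, and clip to [0, 1]."""
    return np.clip((patch - 0.5) * contrast + 0.5 + brightness, 0.0, 1.0)


def compose_patch(base_patch, delta):
    """Add the perturbation to the fixed base patch and keep a valid image."""
    return np.clip(base_patch + delta, 0.0, 1.0)


def perturbation_step(delta, grad, step_size, epsilon):
    """Signed-gradient descent step, then projection onto the l_inf ball."""
    return np.clip(delta - step_size * np.sign(grad), -epsilon, epsilon)


def toy_attack(base_patch, weights, epsilon=0.05, step_size=0.01, steps=10):
    """Minimize a toy linear score sum(weights * patch) over delta only.

    The linear score is a hypothetical stand-in for a detector loss.
    """
    delta = np.zeros_like(base_patch)  # the perturbation is initialized to zero
    for _ in range(steps):
        # Gradient of the toy score w.r.t. delta. It equals `weights` as long
        # as compose_patch does not clip, which holds for the values used here.
        grad = weights
        delta = perturbation_step(delta, grad, step_size, epsilon)
    return delta


rng = np.random.default_rng(0)
base_patch = rng.uniform(0.2, 0.8, size=(8, 8, 3))  # stand-in for a natural-looking patch
weights = rng.normal(size=base_patch.shape)         # stand-in for a detector

delta = toy_attack(base_patch, weights)
adv_patch = compose_patch(base_patch, delta)

score_before = float(np.sum(weights * base_patch))
score_after = float(np.sum(weights * adv_patch))
print("score lowered:", score_after < score_before)
print("max pixel change:", float(np.abs(adv_patch - base_patch).max()))
```

Because the base patch is never updated, every pixel of the final patch stays within `epsilon` of its natural-looking starting point, which is one way to read the trade-off between attack performance and naturalness described above. A call such as `adjust_contrast_brightness(adv_patch, contrast=0.9, brightness=-0.05)` would then darken and flatten the patch slightly; how the parameters are actually chosen to match a scene is not specified in this section.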
