EVD4UAV: An Altitude-Sensitive Benchmark to Evade Vehicle Detection in UAV
Huiming Sun, Jiacheng Guo, Zibo Meng, Tianyun Zhang, Jianwu Fang, Yuewei Lin, Hongkai Yu
TL;DR
This work introduces EVD4UAV, an altitude-sensitive UAV vehicle-detection benchmark with 6,284 images and 90,886 fine-grained vehicle annotations, captured at 50m, 70m, and 90m, for studying patch-based detection evasion. It proposes a unified adversarial patch and evaluates one white-box and two black-box attacks (CLIP-based and distribution-based) against Faster R-CNN, DETR, and YOLOv8, using the gradient-based patch update $P^{i+1} = P^{i} - \eta \cdot \nabla_{P} \mathcal{L}_a$, where $P^{i}$ is the patch at iteration $i$, $\eta$ is the step size, and $\mathcal{L}_a$ denotes the attack loss, which is defined per detector. Results show that white-box patches outperform black-box patches, yet no method achieves robust, altitude-insensitive performance across all heights, which makes altitude a critical factor in both attack and defense design. The work highlights the need for altitude-aware defenses and provides a dataset and methodology for advancing robustness in UAV-based vehicle detection.
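The patch update in the TL;DR can be illustrated with a minimal sketch. Everything below other than the update rule itself is an assumption made for illustration: the quadratic `attack_loss` is a toy stand-in for the detector-specific loss $\mathcal{L}_a$ (no detector is involved), the `target` pattern and the function names are hypothetical, and clipping the patch to the pixel range [0, 1] is a common convention rather than something the source states.

```python
import numpy as np


def attack_loss(patch, target):
    """Toy stand-in for L_a: half the squared distance to a hypothetical target."""
    return 0.5 * float(np.sum((patch - target) ** 2))


def attack_loss_grad(patch, target):
    """Analytic gradient of the toy loss with respect to the patch."""
    return patch - target


def update_patch(patch, grad, eta):
    """One step of P^{i+1} = P^i - eta * grad.

    Clipping to [0, 1] keeps the patch a valid image; this is an assumption,
    not part of the update rule quoted in the text.
    """
    return np.clip(patch - eta * grad, 0.0, 1.0)


rng = np.random.default_rng(0)
patch = rng.uniform(0.0, 1.0, size=(3, 16, 16))  # random RGB patch, channels first
target = np.full_like(patch, 0.5)                # hypothetical target pattern

losses = [attack_loss(patch, target)]
for _ in range(50):
    grad = attack_loss_grad(patch, target)
    patch = update_patch(patch, grad, eta=0.1)
    losses.append(attack_loss(patch, target))
```

In a real attack the gradient would presumably come from backpropagating a detector's loss through the patched image instead of from a closed-form expression, but the update step would keep the same shape.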
Abstract
Vehicle detection in images captured by Unmanned Aerial Vehicles (UAVs) has wide applications in aerial photography and remote sensing, and many public benchmark datasets have been proposed for vehicle detection and tracking in UAV images. Recent studies show that adding an adversarial patch to an object can fool well-trained object detectors based on deep neural networks, raising security concerns for downstream tasks. However, current public UAV datasets tend to lack diverse altitudes, vehicle attributes, and fine-grained instance-level annotations, and they are captured mostly from a side view in which the vehicle roof is blurred, so none of them is well suited to studying adversarial-patch attacks on vehicle detection. In this paper, we propose a new dataset named EVD4UAV, an altitude-sensitive benchmark for evading vehicle detection in UAV imagery, with 6,284 images and 90,886 fine-grained annotated vehicles. The EVD4UAV dataset covers diverse altitudes (50m, 70m, 90m) and vehicle attributes (color, type), and provides fine-grained annotations (horizontal and rotated bounding boxes, instance-level masks) in a top view with clearly visible vehicle roofs. One white-box and two black-box patch-based attack methods are implemented to attack three classic object detectors based on deep neural networks on EVD4UAV. The experimental results show that these representative attack methods cannot achieve robust, altitude-insensitive attack performance.
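One way to see why altitude could matter for a roof-mounted patch is to estimate how large a vehicle appears in the image at each of the three capture heights. The sketch below assumes an idealized pinhole camera pointing straight down, which is a simplification and not the camera model stated in the source. The focal length and vehicle length are hypothetical values chosen only for illustration, and `apparent_size_px` is a hypothetical helper name.

```python
def apparent_size_px(object_size_m, altitude_m, focal_length_px):
    """Pinhole projection for a nadir view: size_px = f_px * size_m / altitude_m."""
    if altitude_m <= 0:
        raise ValueError("altitude must be positive")
    return focal_length_px * object_size_m / altitude_m


FOCAL_LENGTH_PX = 2000.0   # hypothetical focal length expressed in pixels
VEHICLE_LENGTH_M = 4.5     # hypothetical vehicle length in meters

# The three altitudes come from the dataset description (50m, 70m, 90m).
sizes = {
    altitude: apparent_size_px(VEHICLE_LENGTH_M, altitude, FOCAL_LENGTH_PX)
    for altitude in (50, 70, 90)
}

for altitude, size in sizes.items():
    print(f"{altitude} m: vehicle spans about {size:.1f} px")
```

Under these assumptions the vehicle, and any patch on its roof, covers 1.8 times as many pixels along each axis at 50m as at 90m, regardless of the focal length chosen. A patch optimized at one scale therefore has fewer pixels to work with at a higher altitude.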
