AngleRoCL: Angle-Robust Concept Learning for Physically View-Invariant T2I Adversarial Patches
Wenjun Ji, Yuxiang Fu, Luyang Ying, Deng-Ping Fan, Yuyi Wang, Ming-Ming Cheng, Ivor Tsang, Qing Guo
TL;DR
This work studies the angle robustness of text-to-image (T2I) adversarial patches and finds that the robustness of diffusion-generated patches depends strongly on the text prompt, while simple task-oriented instructions fail to improve it. It introduces Angle-Robust Concept Learning (AngleRoCL), which encodes a transferable angle-robust concept as a token embedding that can be inserted into prompts, enabling environment-free, detector-guided optimization to produce patches that remain effective across viewing angles. Experiments in digital and physical settings show substantial improvements over baselines across multiple detectors, with over 50% average relative improvement in attack effectiveness across angles; ablations indicate that the learned concept generalizes beyond the angles and detectors used in training. The approach also sheds light on how textual concepts map to physical robustness and has implications for defending against angle-invariant diffusion-based adversarial threats. Code is released for reproducibility.
Abstract
Recent work has demonstrated that text-to-image (T2I) diffusion models can generate adversarial patches that mislead state-of-the-art object detectors in the physical world, revealing the detectors' vulnerabilities and risks. However, these methods neglect the attack effectiveness of T2I patches when observed from different views in the physical world (i.e., the angle robustness of T2I adversarial patches). In this paper, we study the angle robustness of T2I adversarial patches comprehensively, revealing their angle-robustness issues and showing that the text prompt significantly affects the angle robustness of generated patches, while task-specific linguistic instructions fail to enhance it. Motivated by these findings, we introduce Angle-Robust Concept Learning (AngleRoCL), a simple and flexible approach that learns a generalizable concept (i.e., text embeddings in implementation) representing the capability of generating angle-robust patches. The learned concept can be incorporated into textual prompts and guides T2I models to generate patches whose attack effectiveness is inherently resistant to viewpoint variations. Through extensive simulation and physical-world experiments on five SOTA detectors across multiple views, we demonstrate that AngleRoCL significantly enhances the angle robustness of T2I adversarial patches compared to baseline methods. Our patches maintain high attack success rates even under challenging viewing conditions, with over 50% average relative improvement in attack effectiveness across multiple angles. This research advances the understanding of physically angle-robust patches and provides insights into the relationship between textual concepts and physical properties in T2I-generated content. Our code is available at https://github.com/tsingqguo/anglerocl.
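The abstract does not spell out how the "average relative improvement across multiple angles" is computed. One plausible reading, sketched below purely as an assumption, is to take the per-angle relative gain of a method over a baseline and average it over the evaluated angles. The function name `relative_improvement`, the angle set, and all numbers are hypothetical and made up for illustration; they are not results from the paper.

```python
def relative_improvement(baseline, ours):
    """Average per-angle relative gain, (ours - baseline) / baseline.

    Both arguments map a viewing angle (degrees) to an attack
    effectiveness score. Angles with a zero baseline are skipped,
    since the relative gain is undefined there.
    """
    gains = []
    for angle, base in baseline.items():
        if base == 0:
            continue  # relative gain undefined for a zero baseline
        gains.append((ours[angle] - base) / base)
    if not gains:
        raise ValueError("no angle with a non-zero baseline score")
    return sum(gains) / len(gains)


# Made-up scores for illustration only (NOT results from the paper).
baseline = {0: 0.80, 30: 0.50, 60: 0.20}
ours = {0: 0.88, 30: 0.70, 60: 0.40}

avg_gain = relative_improvement(baseline, ours)
print(f"average relative improvement: {avg_gain:.0%}")
```

Under this reading, large gains at oblique angles, where the baseline score is low, dominate the average, so a relative figure should be read alongside the absolute per-angle scores.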
