Adversarial Wear and Tear: Exploiting Natural Damage for Generating Physical-World Adversarial Examples

Samra Irshad; Seungkyu Lee; Nassir Navab; Hong Joo Lee; Seong Tae Kim

TL;DR

The paper tackles the vulnerability of deep neural networks to physical-world adversarial examples by introducing AdvWT, a class of perturbations embedded in natural wear and tear of outdoor signs. AdvWT uses a GAN-based image-to-image translation framework to learn latent damage styles via StarGAN-v2, then performs adversarial latent optimization to produce damaged signs that fool classifiers while remaining photorealistic. Experiments on two traffic-sign datasets demonstrate high attack success across digital and physical domains, with advantages in naturalness and robustness over existing methods. Moreover, training with AdvWT-augmented data improves out-of-distribution generalization to real-world damaged signs, suggesting practical benefits for robustness and maintenance in safety-critical perception systems.

Abstract

The presence of adversarial examples in the physical world poses significant challenges to the deployment of Deep Neural Networks in safety-critical applications such as autonomous driving. Most existing methods for crafting physical-world adversarial examples are ad-hoc, relying on temporary modifications like shadows, laser beams, or stickers that are tailored to specific scenarios. In this paper, we introduce a new class of physical-world adversarial examples, AdvWT, which draws inspiration from the naturally occurring phenomenon of 'wear and tear', an inherent property of physical objects. Unlike manually crafted perturbations, 'wear and tear' emerges organically over time due to environmental degradation, as seen in the gradual deterioration of outdoor signboards. To achieve this, AdvWT follows a two-step approach. First, a GAN-based, unsupervised image-to-image translation network is employed to model these naturally occurring damages, particularly in the context of outdoor signboards. The translation network encodes the characteristics of damaged signs into a latent 'damage style code'. In the second step, we introduce adversarial perturbations into the style code, strategically optimizing its transformation process. This manipulation subtly alters the damage style representation, guiding the network to generate adversarial images where the appearance of damages remains perceptually realistic, while simultaneously ensuring their effectiveness in misleading neural networks. Through comprehensive experiments on two traffic sign datasets, we show that AdvWT effectively misleads DNNs in both digital and physical domains. AdvWT achieves an effective attack success rate, greater robustness, and a more natural appearance compared to existing physical-world adversarial examples. Additionally, integrating AdvWT into training enhances a model's generalizability to real-world damaged signs.
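
The second step, perturbing the damage style code so that the generated sign misleads a classifier, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the generator and classifier are tiny linear stand-ins (`W_GEN`, `W_CLS`) rather than StarGAN-v2 and a real traffic-sign DNN, the loss is plain cross-entropy, and the L2 budget `eps` around the benign style code is a hypothetical proxy for keeping the damage realistic.

```python
import math

# Toy stand-ins (assumptions, not the paper's networks):
# W_GEN decodes a 2-D damage style code into a 4-pixel damage pattern,
# W_CLS is a linear 2-class classifier over those 4 pixels.
W_GEN = [[0.5, 0.0], [0.0, 0.5], [0.5, 0.5], [-0.5, 0.5]]
W_CLS = [[1.0, 0.5, -0.5, 0.0], [-1.0, 0.0, 0.5, 1.0]]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def generate(x, s):
    """Hypothetical generator: clean sign x plus damage decoded from style code s."""
    return [xi + di for xi, di in zip(x, matvec(W_GEN, s))]


def predict_proba(img):
    """Softmax class probabilities of the toy classifier."""
    logits = matvec(W_CLS, img)
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attack_style_code(x, s0, label, eps=3.0, lr=0.5, steps=200):
    """Gradient ascent on cross-entropy w.r.t. the style code, kept within eps of s0."""
    s = list(s0)
    for _ in range(steps):
        probs = predict_proba(generate(x, s))
        # d(cross-entropy)/d(logits) = probs - one_hot(label)
        g_logits = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
        # Chain rule back through the linear classifier and linear generator.
        g_img = matvec(transpose(W_CLS), g_logits)
        g_s = matvec(transpose(W_GEN), g_img)
        # Ascent step: increase the loss on the true label.
        s = [si + lr * gi for si, gi in zip(s, g_s)]
        # Project back onto the L2 ball of radius eps around the benign code.
        delta = [si - s0i for si, s0i in zip(s, s0)]
        norm = math.sqrt(sum(d * d for d in delta))
        if norm > eps:
            s = [s0i + d * eps / norm for s0i, d in zip(s0, delta)]
    return s


x_clean = [1.0, 0.5, 0.0, 0.0]   # toy "clean sign"
s_benign = [0.2, -0.1]           # toy benign damage style code
s_adv = attack_style_code(x_clean, s_benign, label=0)

p_before = predict_proba(generate(x_clean, s_benign))[0]
p_after = predict_proba(generate(x_clean, s_adv))[0]
print(f"true-class probability: {p_before:.3f} -> {p_after:.3f}")
```

The point of the sketch is the design choice the abstract describes: the search happens in the latent style space rather than in pixel space, so every candidate image is still an output of the damage generator. In a realistic setting the hand-written gradients would be replaced by automatic differentiation through the trained networks.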
