Adversarial Wear and Tear: Exploiting Natural Damage for Generating Physical-World Adversarial Examples

Samra Irshad, Seungkyu Lee, Nassir Navab, Hong Joo Lee, Seong Tae Kim

TL;DR

The paper addresses the vulnerability of deep neural networks to physical-world adversarial examples by introducing AdvWT, a class of perturbations embedded in the natural wear and tear of outdoor signs. AdvWT first trains a GAN-based image-to-image translation network (StarGAN-v2) to learn latent damage styles, then optimizes those latent styles adversarially to produce damaged signs that fool classifiers while remaining photorealistic. Experiments on two traffic-sign datasets demonstrate high attack success in both digital and physical domains, with advantages in naturalness and robustness over existing methods. Moreover, training with AdvWT-augmented data improves out-of-distribution generalization to real-world damaged signs, suggesting practical benefits for robustness and maintenance in safety-critical perception systems.

Abstract

The presence of adversarial examples in the physical world poses significant challenges to the deployment of Deep Neural Networks in safety-critical applications such as autonomous driving. Most existing methods for crafting physical-world adversarial examples are ad-hoc, relying on temporary modifications like shadows, laser beams, or stickers that are tailored to specific scenarios. In this paper, we introduce a new class of physical-world adversarial examples, AdvWT, which draws inspiration from the naturally occurring phenomenon of 'wear and tear', an inherent property of physical objects. Unlike manually crafted perturbations, 'wear and tear' emerges organically over time due to environmental degradation, as seen in the gradual deterioration of outdoor signboards. To achieve this, AdvWT follows a two-step approach. First, a GAN-based, unsupervised image-to-image translation network is employed to model these naturally occurring damages, particularly in the context of outdoor signboards. The translation network encodes the characteristics of damaged signs into a latent 'damage style code'. In the second step, we introduce adversarial perturbations into the style code, strategically optimizing its transformation process. This manipulation subtly alters the damage style representation, guiding the network to generate adversarial images where the appearance of damages remains perceptually realistic, while simultaneously ensuring their effectiveness in misleading neural networks. Through comprehensive experiments on two traffic sign datasets, we show that AdvWT effectively misleads DNNs in both digital and physical domains. AdvWT achieves an effective attack success rate, greater robustness, and a more natural appearance compared to existing physical-world adversarial examples. Additionally, integrating AdvWT into training enhances a model's generalizability to real-world damaged signs.

Paper Structure

This paper contains 27 sections, 11 equations, 9 figures, 4 tables, 1 algorithm.

Figures (9)

  • Figure 1: Adversarial Wear & Tear Examples. (a) Damaged traffic signs observed in real-world environments. (b) Original traffic signs (first row) with correct predictions and adversarially damaged traffic signs generated by AdvWT (second row) with misclassified labels. Our proposed adversarial perturbations not only resemble real-world degradation but also successfully manipulate model predictions.
  • Figure 2: Comparison of physical-world adversarial examples: (a) Examples of AdvCam [Duan2020], AdvLaser [Duan_2021_CVPR], AdvShadow [Zhong_2022_CVPR], AdvRefLight [Wang2023], and AdvWT (ours). (b) Methods are compared in terms of gradual evolution, occurrence rate, pattern diversity, and persistence. Asterisks indicate presence: * (low), ** (moderate), *** (high).
  • Figure 3: Our approach for generating AdvWT examples. Step 1: Training phase of image-to-image translation model, which learns to represent both clean and damaged traffic signs. This framework enables the model to learn a latent representation of traffic sign degradation, capturing the natural variations in wear and tear while maintaining perceptual realism. Step 2: Generating damage styles by sampling latent noise $z$ from a Gaussian distribution and then mapping it into a structured latent space that encodes damage styles via style projector $M(z)$. We define the adversarial style code optimization objective $\Omega$. Step 3: Adversarial style code search algorithm iteratively applies small perturbations $\Delta{s}$ to damage style code $s_{d}$ to generate adversarial examples. Images generated by perturbing the style code are evaluated on the classifier $\mathcal{F}$ and the perturbation strength $\alpha$ is increased to maximize the attack success.
  • Figure 4: Trade-off between adversarial strength and perceptual similarity: As the perturbation factor $\alpha$ increases, Attack Success Rate (ASR) improves (red line), while structural similarity (SSIM) with the clean image decreases (blue line). This illustrates the inherent balance between damage severity and visual realism in AdvWT.
  • Figure 5: Visualization of adversarial examples generated by AdvWT and other methods: AdvWT can generate more diverse and realistic adversarial examples because the perturbation is embedded into the signboard through a naturally evolving process instead of being produced by an external projection or light source.
  • ...and 4 more figures
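
The style code search described in Steps 2 and 3 of the Figure 3 caption can be illustrated with a small sketch. This is not the authors' implementation: the linear `style_projector`, `generator`, and `classifier` below are hypothetical toy stand-ins for $M$, the translation network, and $\mathcal{F}$, and the random-direction search with an increasing list of `alphas` is an assumed simplification of the perturbation schedule. It only shows the control flow: sample $z$, map it to a damage style code $s_d$, perturb it by $\alpha \Delta s$, and stop at the smallest $\alpha$ that changes the prediction.

```python
import numpy as np

STYLE_DIM, IMG_DIM, N_CLASSES = 8, 16, 3

# Hypothetical toy weights; the real method uses trained networks instead.
_init = np.random.default_rng(0)
W_MAP = _init.normal(size=(STYLE_DIM, STYLE_DIM))       # stands in for M
W_GEN = _init.normal(size=(STYLE_DIM, IMG_DIM)) * 0.1   # style pathway of the generator
W_CLF = _init.normal(size=(IMG_DIM, N_CLASSES))         # stands in for classifier F


def style_projector(z):
    """Map latent noise z to a damage style code s_d (toy version of M(z))."""
    return np.tanh(z @ W_MAP)


def generator(x_clean, s):
    """Render a 'damaged' image from a clean image and a style code (toy)."""
    return x_clean + s @ W_GEN


def classifier(x):
    """Return the predicted class index for image x (toy version of F)."""
    return int(np.argmax(x @ W_CLF))


def adversarial_style_search(x_clean, true_label, alphas, rng, n_dirs=32):
    """Try random unit perturbations delta_s, scaled by increasing alpha.

    Returns (x_adv, alpha) for the first misclassified image found,
    or (None, None) if no perturbation in the budget succeeds.
    """
    z = rng.normal(size=STYLE_DIM)      # latent noise from a Gaussian
    s_d = style_projector(z)            # damage style code
    for alpha in alphas:                # increase perturbation strength
        for _ in range(n_dirs):
            delta_s = rng.normal(size=STYLE_DIM)
            delta_s /= np.linalg.norm(delta_s)
            x_adv = generator(x_clean, s_d + alpha * delta_s)
            if classifier(x_adv) != true_label:
                return x_adv, alpha
    return None, None


rng = np.random.default_rng(1)
x_clean = rng.normal(size=IMG_DIM)
label = classifier(x_clean)
x_adv, alpha = adversarial_style_search(x_clean, label, [0.5, 1, 2, 4, 8, 16, 32, 64], rng)
print("attack succeeded" if x_adv is not None else "no adversarial style found", alpha)
```

Returning the first successful $\alpha$ mirrors the trade-off in Figure 4: a larger $\alpha$ makes success more likely but moves the generated image further from the clean sign, so stopping early keeps the damage as mild as the attack allows.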