Adversarial Training against Location-Optimized Adversarial Patches

Sukrut Rao, David Stutz, Bernt Schiele

TL;DR

This paper tackles robustness to clearly visible adversarial patches by introducing location-optimized patches and adversarial patch training. It defines image-specific, untargeted patches and develops strategies to optimize patch location, including full and random location optimization. Through extensive experiments on CIFAR-10 and GTSRB, the authors demonstrate that adversarial patch training with location optimization significantly improves robustness without sacrificing clean accuracy and even enhances robustness to universal patches. The findings suggest practical defense benefits for real-world scenarios like autonomous driving and highlight the importance of patch placement in adversarial robustness.

Abstract

Deep neural networks have been shown to be susceptible to adversarial examples -- small, imperceptible changes constructed to cause mis-classification in otherwise highly accurate image classifiers. As a practical alternative, recent work proposed so-called adversarial patches: clearly visible, but adversarially crafted rectangular patches in images. These patches can easily be printed and applied in the physical world. While defenses against imperceptible adversarial examples have been studied extensively, robustness against adversarial patches is poorly understood. In this work, we first devise a practical approach to obtain adversarial patches while actively optimizing their location within the image. Then, we apply adversarial training on these location-optimized adversarial patches and demonstrate significantly improved robustness on CIFAR10 and GTSRB. Additionally, in contrast to adversarial training on imperceptible adversarial examples, our adversarial patch training does not reduce accuracy.
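As a rough illustration of the idea described in the abstract, the sketch below pastes a rectangular patch into an image through a binary mask and then tries one random move of the patch location, keeping the move only if a loss increases. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `apply_patch`, `random_location_step`, `loss_fn`, and `stride` are hypothetical, the loss is supplied by the caller, and positions are only clipped to the image bounds rather than restricted to an outer border.

```python
import numpy as np

def apply_patch(image, patch, top, left):
    """Return a copy of `image` (H, W, C) with `patch` (h, w, C) pasted at (top, left)."""
    h, w = patch.shape[:2]
    # Binary mask: 1 inside the patch rectangle, 0 elsewhere.
    mask = np.zeros(image.shape[:2] + (1,), dtype=image.dtype)
    mask[top:top + h, left:left + w] = 1
    # Patch content embedded in an image-sized canvas.
    canvas = np.zeros_like(image)
    canvas[top:top + h, left:left + w] = patch
    # Masked blend: original pixels outside the mask, patch pixels inside.
    return (1 - mask) * image + mask * canvas

def random_location_step(loss_fn, image, patch, top, left, stride, rng):
    """Try one random direction; keep the move only if `loss_fn` increases (assumed rule)."""
    H, W = image.shape[:2]
    h, w = patch.shape[:2]
    directions = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    dy, dx = directions[rng.integers(4)]
    # Clip so the patch stays fully inside the image.
    new_top = int(np.clip(top + dy * stride, 0, H - h))
    new_left = int(np.clip(left + dx * stride, 0, W - w))
    old_loss = loss_fn(apply_patch(image, patch, top, left))
    new_loss = loss_fn(apply_patch(image, patch, new_top, new_left))
    if new_loss > old_loss:
        return new_top, new_left
    return top, left
```

In an actual attack, `loss_fn` would be the classifier's loss on the patched image, and this location step would be interleaved with gradient updates of the patch content; both are left out here to keep the sketch self-contained.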

Paper Structure

This paper contains 10 sections, 3 equations, 5 figures, 6 tables, 2 algorithms.

Figures (5)

  • Figure 1: Adversarial patch training. Left: Comparison of imperceptible adversarial examples (top) and adversarial patches (bottom), showing an adversarial example and the corresponding perturbation. On top, the perturbation is within $[-0.03, 0.03]$ and gray corresponds to no change. Middle: Adversarial patches with location optimization. We constrain patches to the outer (white) border of images to ensure label constancy (top left) and optimize the initial location locally (top right and bottom left). Repeating our attack with varying initial location reveals adversarial locations of our adversarially trained model, AT-RandLO, in Fig. \ref{fig:overlayheatmap:cifar}. Right: Clean and robust test error for adversarial training on location-optimized patches in comparison to normal training and data augmentation with random patches. On both CIFAR10 and GTSRB, adversarial training improves robustness significantly, cf. Table \ref{tab:main:cifar}.
  • Figure 2: Our adversarial patch attack on CIFAR10 and GTSRB. Top: correctly classified examples; bottom: incorrectly classified after adding an adversarial patch. Adversarial patches were obtained against a normally trained ResNet-20 \cite{HeCVPR2016}.
  • Figure 2: Ablation study of AP-Rand on CIFAR10. We report robust test error RErr in % for each model against AP-Rand with varying number of iterations $T$ and random restarts $r$. More iterations or restarts generally lead to higher RErr.
  • Figure 3: Robust test error vs. patch size. Robust test error RErr in % as a function of (square) patch size using AP-FullLO(50, 3) against Normal, i.e., adversarial patches with full location optimization, $50$ iterations and $3$ random restarts. We use $8\times8$ patches, the size at which RErr on CIFAR10 stagnates.
  • Figure 4: Location heatmaps of our adversarial patch attacks. Heat maps corresponding to the final patch location using AP-FullLO(10,1000). Top: considering all $r=1000$ restarts; bottom: considering only successful restarts. See text for details.