
Distillation-Enhanced Physical Adversarial Attacks

Wei Liu, Yonglin Wu, Chaoqun Li, Zhuodong Liu, Huanqian Yan

TL;DR

This work proposes a novel physical adversarial attack method that leverages knowledge distillation to improve attack performance by 20%, while maintaining stealth, highlighting its practical value.

Abstract

The study of physical adversarial patches is crucial for identifying vulnerabilities in AI-based recognition systems and developing more robust deep learning models. While recent research has focused on improving patch stealthiness for greater practical applicability, achieving an effective balance between stealth and attack performance remains a significant challenge. To address this issue, we propose a novel physical adversarial attack method that leverages knowledge distillation. Specifically, we first define a stealthy color space tailored to the target environment to ensure smooth blending. Then, we optimize an adversarial patch in an unconstrained color space, which serves as the 'teacher' patch. Finally, we use an adversarial knowledge distillation module to transfer the teacher patch's knowledge to the 'student' patch, guiding the optimization of the stealthy patch. Experimental results show that our approach improves attack performance by 20%, while maintaining stealth, highlighting its practical value.
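The first step in the abstract, restricting the stealthy patch to colors drawn from the target environment, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the helper names `extract_palette` and `palette_patch`, the use of k-means to pick base colors, and the softmax mixture used to keep every patch pixel inside the palette.

```python
import numpy as np

def extract_palette(image, k=4, iters=20, seed=0):
    """Hypothetical helper: plain k-means over RGB pixels to pick k base colors."""
    rng = np.random.default_rng(seed)
    pixels = image.reshape(-1, 3).astype(np.float64)
    # Initialise the centers from k distinct pixels of the environment image.
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every pixel to every center: shape (N, k).
        dist = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def palette_patch(logits, palette, temperature=1.0):
    """Hypothetical parameterization: each pixel is a softmax-weighted
    mixture of palette colors, so it cannot leave the palette's convex hull."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(z)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ palette  # (H, W, K) @ (K, 3) -> (H, W, 3)
```

In such a setup the `logits` array would be the quantity optimized for the student patch, while the teacher patch would be optimized directly in unconstrained RGB. Lowering `temperature` pushes each pixel toward a single palette color.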

Paper Structure (19 sections, 12 equations, 5 figures, 2 tables)

Figures (5)

  • Figure 1: Left: Detection-box loss during training. A significant gap exists between the non-distillation method and AdvPatch (Thys et al., 2019); our distillation-based method narrows this gap substantially. Right: Adversarial patches generated by different methods: (a) no adversarial patch, (b) AdvPatch, (c) non-distillation, (d) ours (distillation). Our distillation-based method improves attack performance while preserving the same level of environmental concealment as the non-distillation method.
  • Figure 2: Overview of our proposed method. First, we extract base colors from the environment to craft an adversarial patch that blends into its surroundings. Next, we apply knowledge distillation, using a color-unconstrained adversarial patch to guide the generation of the stealthy patch and thereby strengthen its attack.
  • Figure 3: Comparison with other SOTA methods in physical experiments. The top row shows, from left to right, the stealthy patch generation methods NatPatch and DAP with their corresponding adversarial patches. The bottom row shows the non-distillation-based method AdvCat and our approach with their adversarial patches. Our method maintains better stealth while achieving a higher attack success rate (ASR), effectively deceiving the YOLOv3 detector.
  • Figure 4: The attack performance with/without the distillation module. As shown, the distillation module leads to a significant drop in mAP on the INRIA test set.
  • Figure 5: Decline in YOLOv5 detection-box confidence under different distillation loss coefficients.
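
The captions for Figures 4 and 5 refer to a distillation module weighted by a loss coefficient. A minimal sketch of how such a weighted objective might be assembled is given below. The specific forms are assumptions chosen for illustration, not the paper's definitions: mean detection-box confidence as the attack term, mean squared error between student and teacher detector responses as the distillation term, and the function names and the coefficient `lam`.

```python
import numpy as np

def attack_loss(box_conf):
    """Assumed attack term: mean detection-box confidence (lower is better)."""
    return float(np.mean(box_conf))

def distill_loss(student_resp, teacher_resp):
    """Assumed distillation term: mean squared error between the detector's
    responses to the student patch and to the teacher patch."""
    diff = np.asarray(student_resp) - np.asarray(teacher_resp)
    return float(np.mean(diff ** 2))

def total_loss(student_conf, student_resp, teacher_resp, lam):
    """Attack term plus the distillation term scaled by the coefficient lam."""
    return attack_loss(student_conf) + lam * distill_loss(student_resp, teacher_resp)
```

With `lam = 0` this reduces to the non-distillation objective. Larger values pull the student patch's detector responses toward those of the teacher patch.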