One Pixel is All I Need

Deng Siqin, Zhou Xiaoyi

TL;DR

This work reveals a ViT-specific backdoor vulnerability wherein quasi-triggers, guided by a Perturbation Sensitivity Distribution Map (PSDM), achieve high attack success with minimal poisoning by exploiting patch-centered sensitivity. The authors introduce WorstVIT, a single-pixel data-poisoning backdoor that leverages PSDM to reliably hijack predictions across inputs and even in real-world video scenarios, while remaining resistant to common defenses. They formalize quasi-triggers, demonstrate their superior robustness in ViTs over CNNs, and provide extensive white-box and cross-model evaluations, including Swin-VIT variants. The findings underscore a severe robustness gap for ViTs and motivate the development of ViT-aware defenses and detection methods to mitigate single-pixel backdoors in vision systems.

Abstract

Vision Transformers (ViTs) have achieved record-breaking performance in various visual tasks. However, concerns about their robustness against backdoor attacks have grown. Backdoor attacks involve associating a specific trigger with a target label, causing the model to predict the attacker-specified label when the trigger is present, while correctly identifying clean images. We found that ViTs exhibit higher attack success rates for quasi-triggers (patterns different from but similar to the original training triggers) compared to CNNs. Moreover, some backdoor features in clean samples can suppress the original trigger, making quasi-triggers more effective. To better understand and exploit these vulnerabilities, we developed a tool called the Perturbation Sensitivity Distribution Map (PSDM). PSDM computes and sums gradients over many inputs to show how sensitive the model is to small changes in the input. In ViTs, PSDM reveals a patch-like pattern where central pixels are more sensitive than edges. We use PSDM to guide the creation of quasi-triggers. Based on these findings, we designed "WorstVIT," a simple yet effective data poisoning backdoor for ViT models. This attack requires an extremely low poisoning rate, trains for just one epoch, and modifies a single pixel to successfully attack all validation images.
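
The abstract describes PSDM only at a high level: gradients with respect to the input are computed and summed over many inputs. The sketch below illustrates that accumulation idea and is not the authors' implementation. Several parts are assumptions made for illustration: the gradient is estimated by central finite differences rather than backpropagation, absolute values are summed, and `toy_loss` with its `WEIGHTS` is a hypothetical stand-in for a real model's loss.

```python
import random


def numerical_gradient(loss_fn, image, eps=1e-4):
    """Central finite-difference estimate of d(loss)/d(pixel) for every pixel."""
    height, width = len(image), len(image[0])
    grad = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            original = image[i][j]
            image[i][j] = original + eps
            loss_plus = loss_fn(image)
            image[i][j] = original - eps
            loss_minus = loss_fn(image)
            image[i][j] = original  # restore the pixel before moving on
            grad[i][j] = (loss_plus - loss_minus) / (2 * eps)
    return grad


def sensitivity_map(loss_fn, images, eps=1e-4):
    """Sum per-pixel absolute gradients over all images (assumed aggregation)."""
    height, width = len(images[0]), len(images[0][0])
    psdm = [[0.0] * width for _ in range(height)]
    for image in images:
        grad = numerical_gradient(loss_fn, image, eps)
        for i in range(height):
            for j in range(width):
                psdm[i][j] += abs(grad[i][j])
    return psdm


# Hypothetical toy loss on 2x2 "images"; a real PSDM would use a trained model.
WEIGHTS = [[1.0, 2.0], [3.0, 4.0]]


def toy_loss(image):
    return sum(
        0.5 * WEIGHTS[i][j] * image[i][j] ** 2
        for i in range(2)
        for j in range(2)
    )


rng = random.Random(0)
images = [
    [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]
    for _ in range(8)
]
psdm = sensitivity_map(toy_loss, images)
```

For this toy loss the per-pixel gradient is `WEIGHTS[i][j] * image[i][j]`, so each map entry is that weight times the summed absolute pixel values. With a real ViT one would replace the finite-difference step with automatic differentiation and then inspect the map for the patch-like pattern the abstract reports.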

Paper Structure

This paper contains 24 sections, 3 equations, 9 figures, 11 tables.

Figures (9)

  • Figure 1: Perturbation Sensitivity Distribution Maps (PSDMs) for different configurations of models. The PSDMs exhibit a unique patch-like pattern, indicating that the model is more sensitive to perturbations at the centers of the patches compared to the edges.
  • Figure 2: Quasi-triggers in Vision Transformers (ViTs) exhibit good transferability across different patches.
  • Figure 3: Attack Success Rate for Different Models.
  • Figure 4: This image shows the attack effect on the WorstVIT model, with the triggers highlighted by red circles.
  • Figure 5: PSDMs of ViT Models Trained with Different Methods
  • ...and 4 more figures