
Low-Frequency Black-Box Backdoor Attack via Evolutionary Algorithm

Yanqi Qiao, Dazhuang Liu, Rui Wang, Kaitai Liang

TL;DR

This paper proposes a robust low-frequency backdoor attack for the black-box setting that minimally perturbs spectrum components while maintaining perceptual similarity in the spatial domain. It builds on the insight that optimal triggers can be located in low-frequency regions, which maximizes attack effectiveness, robustness against image transformation operations, and stealthiness in both the spatial and frequency domains.

Abstract

While convolutional neural networks (CNNs) have achieved success in computer vision tasks, they are vulnerable to backdoor attacks. Such attacks can mislead the victim model into making an attacker-chosen prediction whenever a specific trigger pattern is present. Until now, the trigger injection of existing attacks has been mainly limited to the spatial domain. Recent works exploit the perceptual properties of planting specific patterns in the frequency domain, which appear only as indistinguishable pixel-wise perturbations in the pixel domain. However, in the black-box setup, the inaccessibility of the training process often calls for more complex trigger designs. Existing frequency attacks simply handcraft the magnitude of the spectrum, which introduces anomalous frequency disparities between clean and poisoned data and risks removal of the trigger by image processing operations (such as lossy compression and filtering). In this paper, we propose a robust low-frequency black-box backdoor attack (LFBA), which minimally perturbs low-frequency components of the frequency spectrum while maintaining perceptual similarity in the spatial domain. The key insight of our attack is to restrict the search for the optimal trigger to the low-frequency region, which achieves high attack effectiveness, robustness against image transformation defenses, and stealthiness in both domains. We use simulated annealing (SA), a form of evolutionary algorithm, to optimize the properties of the frequency trigger, including the number of manipulated frequency bands and the perturbation of each frequency component, without relying on knowledge of the victim classifier. Extensive experiments on real-world datasets verify the effectiveness and robustness of LFBA against image processing operations and state-of-the-art backdoor defenses, as well as its inherent stealthiness in both the spatial and frequency domains, making it resilient against frequency inspection.
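
The search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a square grayscale image with values in [0, 1], uses the 2D DCT as the frequency transform, and the function names (`dct_matrix`, `add_low_freq_trigger`, `simulated_annealing`, `toy_cost`) and all parameter values are hypothetical. In particular, `toy_cost` is only a placeholder for whatever black-box objective the attacker actually uses; here it just asks for a small, fixed amount of pixel change.

```python
import numpy as np


def dct_matrix(n):
    """Return the orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


def add_low_freq_trigger(image, bands, deltas):
    """Add `deltas` to the DCT coefficients at `bands` and return the image."""
    d = dct_matrix(image.shape[0])
    spectrum = d @ image @ d.T          # forward 2D DCT
    for (u, v), delta in zip(bands, deltas):
        spectrum[u, v] += delta
    return np.clip(d.T @ spectrum @ d, 0.0, 1.0)  # inverse 2D DCT


def toy_cost(clean, poisoned, target_change=0.02):
    """Placeholder objective (assumption): RMS pixel change near a target."""
    change = np.sqrt(np.mean((poisoned - clean) ** 2))
    return abs(change - target_change)


def simulated_annealing(image, cost, low_freq=4, steps=300,
                        t0=1.0, cooling=0.98, seed=0):
    """Search which low-frequency bands to perturb and by how much."""
    rng = np.random.default_rng(seed)
    # Candidate bands: the top-left low_freq x low_freq block, DC excluded.
    cands = [(u, v) for u in range(low_freq) for v in range(low_freq)
             if (u, v) != (0, 0)]
    active = rng.random(len(cands)) < 0.5
    deltas = rng.normal(0.0, 0.1, len(cands))

    def evaluate(act, dlt):
        bands = [c for c, a in zip(cands, act) if a]
        return cost(image, add_low_freq_trigger(image, bands, dlt[act]))

    cur = evaluate(active, deltas)
    best = (cur, active.copy(), deltas.copy())
    temp = t0
    for _ in range(steps):
        new_active, new_deltas = active.copy(), deltas.copy()
        j = rng.integers(len(cands))
        if rng.random() < 0.3:
            new_active[j] = not new_active[j]       # change the band set
        else:
            new_deltas[j] += rng.normal(0.0, 0.05)  # change one magnitude
        new = evaluate(new_active, new_deltas)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature cools.
        if new < cur or rng.random() < np.exp((cur - new) / temp):
            active, deltas, cur = new_active, new_deltas, new
            if cur < best[0]:
                best = (cur, active.copy(), deltas.copy())
        temp *= cooling
    best_cost, best_active, best_deltas = best
    bands = [c for c, a in zip(cands, best_active) if a]
    return best_cost, bands, best_deltas[best_active]
```

Calling `simulated_annealing(img, toy_cost)` on an array `img` returns the best cost found together with the selected bands and their perturbations. The search only evaluates the cost on images, so it needs no gradients or internals of any classifier, which is the property the black-box setting relies on.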

Paper Structure (16 sections, 8 equations, 9 figures, 4 tables, 1 algorithm)


Figures (9)

  • Figure 1: Comparison of poisoned images from existing attacks with their corresponding frequency disparities (amplified by $5\times$) relative to clean images. Left: clean images; mid: poisoned images from spatial-domain attacks, including BadNets, Blend, SIG, IAB and ReFool; right: poisoned images from frequency-domain attacks, including FTrojan, FIBA and our LFBA attack. Although state-of-the-art frequency triggers achieve better perceptual similarity than spatial triggers, they introduce anomalous frequency artifacts.
  • Figure 2: (a): The results of LFBA under NC on different datasets; (b)-(d): The attack effectiveness of LFBA against Fine-pruning.
  • Figure 3: The entropy distributions of LFBA against STRIP on 5 datasets.
  • Figure 4: Visualization of network attention by Grad-CAM on GTSRB, CIFAR-10, T-IMNET and CelebA. Compared to the visualization heatmaps of clean images, LFBA does not introduce any unusual regions.
  • Figure 5: Visualization of DCT spectra of clean and poisoned samples under various spatial and frequency attacks on two datasets with different input sizes, CIFAR-10 ($32\times32$) and CelebA ($64\times64$). We randomly select 10000 samples from each dataset and showcase the averaged spectrum results.
  • ...and 4 more figures