Long-Tailed Backdoor Attack Using Dynamic Data Augmentation Operations

Lu Pang, Tao Sun, Weimin Lyu, Haibin Ling, Chao Chen

TL;DR

This paper analyzes how data imbalance affects backdoor attacks and proposes Dynamic Data Augmentation Operation (D$^2$AO), a backdoor attack whose selectors choose augmentation operations based jointly on the class, the sample type, and the sample features.

Abstract

Recently, backdoor attacks have become a growing security threat to deep neural networks and have drawn the attention of researchers. Backdoor attacks exploit vulnerabilities in third-party pretrained models during the training phase, so that the models behave normally on clean samples but mispredict samples that carry specific triggers. Existing backdoor attacks mainly focus on balanced datasets. However, real-world datasets often follow long-tailed distributions. In this paper, for the first time, we explore backdoor attacks on such datasets. Specifically, we first analyze the influence of data imbalance on backdoor attacks. Based on our analysis, we propose an effective backdoor attack named Dynamic Data Augmentation Operation (D$^2$AO). We design D$^2$AO selectors to select operations depending jointly on the class, sample type (clean vs. backdoored), and sample features. Meanwhile, we develop a trigger generator to generate sample-specific triggers. Through simultaneous optimization of the backdoored model and trigger generator, guided by dynamic data augmentation operation selectors, we achieve significant advancements. Extensive experiments demonstrate that our method achieves state-of-the-art attack performance while preserving clean accuracy.

Paper Structure

This paper contains 33 sections, 4 equations, 6 figures, 14 tables.

Figures (6)

  • Figure 1: (a) Traditional long-tailed learning needs to balance different (clean) classes. (b) Long-tailed backdoor attack further needs to balance backdoored and clean samples, where backdoored samples can be viewed as a special class. (In the illustrations, backdoored samples are created from corresponding clean samples, and then relabeled as Class B.)
  • Figure 2: Framework of our method, including (a) Backdoored Model Training and (b) Operation Selectors Training. At stage (a), the Clean Selector chooses class-wise augmentation operations for clean samples. The Backdoored Selector calculates a probability distribution over $N$ data augmentation operations with fixed strength $q$ and chooses instance-wise operations according to these probabilities. The Trigger Generator adapts to the data augmentation of backdoored samples to generate effective trigger patterns. The trigger generator and the backdoored model are optimized simultaneously. At stage (b), the clean and backdoored selectors are updated based on the strength score $s(k)^t$ and the proposed loss $\mathcal{L}_h$. In each epoch, stage (b) is conducted first, followed by stage (a).
  • Figure 3: Class-wise performance comparison.
  • Figure 4: Results of resilience against Neural Cleanse and Fine-Pruning. (a) Models are flagged as backdoored if the Anomaly Index exceeds 2. (b) Our attack is resilient against Fine-Pruning.
  • Figure 5: Backdoored images using different attacks.
  • ...and 1 more figure
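
The Figure 2 caption describes a backdoored selector that turns a sample into a probability distribution over $N$ augmentation operations with a fixed strength $q$ and then picks one operation per instance. The following is only a rough sketch of that idea, not the authors' implementation. It assumes a linear scoring layer followed by a softmax, and it assumes the operation is sampled from the resulting distribution. The three toy operations, the weight matrix, the feature vector, and the value of `Q` are all hypothetical and chosen purely for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_operation(features, weights, rng):
    """Score each operation with a linear map (an assumption), then sample one index."""
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    probs = softmax(logits)
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs


# Hypothetical augmentation operations on a flat list of pixel values in [0, 1].
# Each takes the same fixed strength q, mirroring the caption's "fixed strength".
def identity(pixels, q):
    return list(pixels)


def brighten(pixels, q):
    return [min(1.0, p + q) for p in pixels]


def invert_blend(pixels, q):
    return [(1.0 - q) * p + q * (1.0 - p) for p in pixels]


OPERATIONS = [identity, brighten, invert_blend]  # N = 3 in this toy setup
Q = 0.2  # hypothetical fixed strength

rng = random.Random(0)
features = [0.5, -1.0]  # stand-in for features of one backdoored sample
weights = [[0.1, 0.2], [1.0, 0.0], [0.0, -1.0]]  # one row per operation

index, probs = select_operation(features, weights, rng)
augmented = OPERATIONS[index]([0.0, 0.5, 0.9], Q)
```

Because the choice depends on `features`, two samples from the same class can receive different operations, which is what makes the selection instance-wise rather than class-wise. How the selectors are actually parameterized and trained (the strength score and the loss $\mathcal{L}_h$) is specified in the paper, not in this sketch.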