DeepSweep: An Evaluation Framework for Mitigating DNN Backdoor Attacks using Data Augmentation

Han Qiu, Yi Zeng, Shangwei Guo, Tianwei Zhang, Meikang Qiu, Bhavani Thuraisingham

TL;DR

Backdoor attacks on DNNs, which arise from untrusted data and models, pose critical security risks. The authors propose DeepSweep, a data-augmentation-driven framework that automatically discovers a unified two-stage defense: fine-tuning the infected model with one augmentation policy, then preprocessing inputs at inference time with a separate policy. Evaluated against a diverse Attack Database using 71 augmentation functions, the approach mitigates eight attack types and outperforms five established defenses while maintaining practical model usability. The framework is extensible and open source, and is intended as a benchmark to accelerate future research on DNN backdoor robustness.

Abstract

Public resources and services (e.g., datasets, training platforms, pre-trained models) have been widely adopted to ease the development of Deep Learning-based applications. However, if the third-party providers are untrusted, they can inject poisoned samples into the datasets or embed backdoors in those models. Such an integrity breach can cause severe consequences, especially in safety- and security-critical applications. Various backdoor attack techniques have been proposed for higher effectiveness and stealthiness. Unfortunately, existing defense solutions are not practical to thwart those attacks in a comprehensive way. In this paper, we investigate the effectiveness of data augmentation techniques in mitigating backdoor attacks and enhancing DL models' robustness. An evaluation framework is introduced to achieve this goal. Specifically, we consider a unified defense solution, which (1) adopts a data augmentation policy to fine-tune the infected model and eliminate the effects of the embedded backdoor; (2) uses another augmentation policy to preprocess input samples and invalidate the triggers during inference. We propose a systematic approach to discover the optimal policies for defending against different backdoor attacks by comprehensively evaluating 71 state-of-the-art data augmentation functions. Extensive experiments show that our identified policy can effectively mitigate eight different kinds of backdoor attacks and outperform five existing defense methods. We envision this framework can be a good benchmark tool to advance future DNN backdoor studies.
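The two-stage defense described in the abstract can be pictured as two composed augmentation policies, one applied to the fine-tuning data and one applied to every input at inference time. The following is only a minimal sketch under stated assumptions: all names (`median_filter3`, `hflip`, `compose`, `augment_for_finetuning`, `defended_predict`) are hypothetical, the two toy functions are stand-ins and not the paper's 71 augmentation functions or its selected policies, and the fine-tuning step itself is left out.

```python
from statistics import median
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]            # grayscale image as rows of pixel values
Augment = Callable[[Image], Image]   # an augmentation maps an image to an image


def median_filter3(img: Image) -> Image:
    """Toy 3x3 median filter; windows are clipped at the image border."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [
                img[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            ]
            row.append(median(window))
        out.append(row)
    return out


def hflip(img: Image) -> Image:
    """Toy horizontal flip, standing in for a geometric augmentation."""
    return [list(reversed(row)) for row in img]


def compose(policy: Sequence[Augment]) -> Augment:
    """Turn a policy (ordered list of augmentations) into one function."""
    def apply(img: Image) -> Image:
        for fn in policy:
            img = fn(img)
        return img
    return apply


def augment_for_finetuning(
    samples: Sequence[Tuple[Image, int]], policy: Sequence[Augment]
) -> List[Tuple[Image, int]]:
    """Stage 1 (assumed shape): transform clean samples, keep their labels.

    The caller would fine-tune the infected model on the returned samples.
    """
    transform = compose(policy)
    return [(transform(img), label) for img, label in samples]


def defended_predict(model: Callable[[Image], object],
                     policy: Sequence[Augment], img: Image):
    """Stage 2 (assumed shape): preprocess the input, then query the model."""
    return model(compose(policy)(img))
```

As a usage illustration, a toy "model" that fires on a single bright pixel at the image center returns its target label on a patched image, but not once `defended_predict` runs the image through `[median_filter3]` first, because the isolated pixel is smoothed away. Real triggers and real models are far less cooperative, which is why the paper searches over many candidate functions rather than fixing one.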

Paper Structure

This paper contains 34 sections, 1 equation, 8 figures, 9 tables, 5 algorithms.

Figures (8)

  • Figure 1: Framework overview of DeepSweep.
  • Figure 2: Trigger-patched samples in various backdoor attacks in DeepSweep.
  • Figure 3: Visual results of the transformed images with different augmentation functions individually.
  • Figure 4: The Fine-tuning Transformation Policy includes six functions: three affine transformations (OD, RSPA, and SAT) and three median filters (GCSM, GESM, and DSSM).
  • Figure 5: Inference Transformation Policy consists of three functions: two median filters affect the triggers from two spaces; SAT helps distort the image. The first row is for a clean image, and the second row is for a patched image.
  • ...and 3 more figures