BackdoorBench: A Comprehensive Benchmark of Backdoor Learning

Baoyuan Wu, Hongrui Chen, Mingda Zhang, Zihao Zhu, Shaokui Wei, Danni Yuan, Chao Shen

TL;DR

BackdoorBench addresses the fragmented evaluation landscape of backdoor learning with an extensible, modular benchmark that implements 8 backdoor attacks and 9 defenses, evaluated across 5 poisoning ratios, 5 models, and 4 datasets (up to 8,000 attack-defense pairs). It pairs a standardized protocol with 5 analysis tools (e.g., t-SNE, Grad-CAM, Shapley maps, FSM, neuron activation) to enable fair comparison and to show how attacks, defenses, poisoning ratios, and model architectures interact. The study reports nuanced behaviors such as non-monotonic attack success rate (ASR) responses at high poisoning ratios, architecture-dependent effectiveness, and trigger generalization phenomena, which suggest directions for more robust defense design and for future benchmark expansion. Overall, BackdoorBench offers a practical platform for reproducible backdoor-learning research and for the empirical analysis needed to build more resilient models.

Abstract

Backdoor learning is an emerging and vital topic for studying the vulnerability of deep neural networks (DNNs). Many pioneering backdoor attack and defense methods are being proposed, successively or concurrently, in a rapid arms race. However, we find that the evaluations of new methods are often not thorough enough to verify their claims or to measure their performance accurately, mainly due to the rapid development, the diverse settings, and the difficulties of implementation and reproducibility. Without thorough evaluations and comparisons, it is hard to track current progress and to design the future development roadmap of the literature. To alleviate this dilemma, we build a comprehensive benchmark of backdoor learning called BackdoorBench. It consists of an extensible, modular codebase (currently including implementations of 8 state-of-the-art (SOTA) attacks and 9 SOTA defense algorithms) and a standardized protocol for the complete backdoor learning procedure. We also provide comprehensive evaluations of every pair of 8 attacks against 9 defenses, with 5 poisoning ratios, based on 5 models and 4 datasets, thus 8,000 pairs of evaluations in total. We present extensive analysis of these 8,000 evaluations from different perspectives, studying the effects of different factors in backdoor learning. All code and evaluations of BackdoorBench are publicly available at https://backdoorbench.github.io.
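
The evaluation count quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the abstract does not spell out how 8,000 is reached, and the plain product 8 x 9 x 5 x 5 x 4 gives 7,200. One reading that reproduces 8,000 is to also count each attack evaluated with no defense applied; that no-defense baseline is an assumption here, not something the abstract states, and all variable names are hypothetical.

```python
# Counts taken from the abstract.
num_attacks = 8
num_defenses = 9
num_poisoning_ratios = 5
num_models = 5
num_datasets = 4

# Each (poisoning ratio, model, dataset) combination is one evaluation setting.
settings_per_pair = num_poisoning_ratios * num_models * num_datasets

# Every attack evaluated against every defense in every setting.
defended_evals = num_attacks * num_defenses * settings_per_pair

# ASSUMPTION (not stated in the abstract): each attack is also evaluated
# once per setting with no defense applied, as a baseline.
undefended_evals = num_attacks * settings_per_pair

total_evals = defended_evals + undefended_evals

print(f"settings per attack-defense pair: {settings_per_pair}")
print(f"defended evaluations:             {defended_evals}")
print(f"undefended baseline (assumed):    {undefended_evals}")
print(f"total:                            {total_evals}")
```

Under this assumption the defended and baseline runs sum to the 8,000 evaluations reported; without it, the stated factors alone account for 7,200.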

Paper Structure (41 sections, 2 equations, 20 figures, 17 tables)

This paper contains 41 sections, 2 equations, 20 figures, 17 tables.

Figures (20)

  • Figure 1: The general structure of the modular based codebase of BackdoorBench.
  • Figure 2: Performance distribution of different attack-defense pairs. Each color pattern represents one attack-defense pair, with attacks distinguished by patterns, while defenses by colors.
  • Figure 3: The effects of different poisoning ratios on backdoor learning.
  • Figure 4: The changes of neuron activation values due to the FT defense (Top row), and the changes of t-SNE visualization of feature representations due to the ABL defense (Middle row) and the ANP defense (Bottom row), respectively.
  • Figure 5: The effects of different model architectures using different defense and attack methods.
  • ...and 15 more figures