BackdoorBench: A Comprehensive Benchmark of Backdoor Learning
Baoyuan Wu, Hongrui Chen, Mingda Zhang, Zihao Zhu, Shaokui Wei, Danni Yuan, Chao Shen
TL;DR
BackdoorBench addresses the fragmented evaluation landscape of backdoor learning by providing an extensible, modular benchmark with implementations of 8 backdoor attacks and 9 defenses, evaluated across 5 poisoning ratios, 5 models, and 4 datasets (yielding up to 8,000 attack-defense pairs). It couples a standardized protocol with 5 analysis tools (t-SNE, Grad-CAM, Shapley maps, FSM, and neuron activation) to enable fair comparison and deeper insight into how attacks, defenses, poisoning ratios, and model architectures interact. The study uncovers nuanced behaviors such as non-monotonic ASR responses at high poisoning ratios, architecture-dependent effectiveness, and trigger generalization phenomena, suggesting new directions for robust defense design and future benchmark expansion. Overall, BackdoorBench provides a practical platform for reproducible progress in backdoor learning and supports the methodological and empirical understanding needed to develop resilient models.
Abstract
Backdoor learning is an emerging and vital topic for studying the vulnerability of deep neural networks (DNNs). Many pioneering backdoor attack and defense methods are being proposed, successively or concurrently, in a rapid arms race. However, we find that the evaluations of new methods are often not thorough enough to verify their claims and actual performance, mainly due to the rapid development, diverse settings, and the difficulties of implementation and reproducibility. Without thorough evaluations and comparisons, it is difficult to track current progress and to design a roadmap for future development of the field. To alleviate this dilemma, we build a comprehensive benchmark of backdoor learning called BackdoorBench. It consists of an extensible modular codebase (currently including implementations of 8 state-of-the-art (SOTA) attacks and 9 SOTA defense algorithms) and a standardized protocol of complete backdoor learning. We also provide comprehensive evaluations of every pair of 8 attacks against 9 defenses, with 5 poisoning ratios, based on 5 models and 4 datasets, thus 8,000 pairs of evaluations in total. We present abundant analysis of these 8,000 evaluations from different perspectives, studying the effects of different factors in backdoor learning. All code and evaluations of BackdoorBench are publicly available at https://backdoorbench.github.io.
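The size of the evaluation grid described in the abstract can be checked with a short enumeration. The sketch below is illustrative only: the labels are hypothetical placeholders, since the abstract gives counts rather than names. Counting only the 9 defenses gives 7,200 combinations. The stated total of 8,000 is reproduced if each attack is also evaluated once with no defense applied, which is an assumption made here to reconcile the numbers, not something the abstract states.

```python
from itertools import product

# Placeholder labels: the abstract specifies only how many of each there are.
attacks = [f"attack_{i}" for i in range(8)]
defenses = [f"defense_{i}" for i in range(9)]
poisoning_ratios = [f"ratio_{i}" for i in range(5)]
models = [f"model_{i}" for i in range(5)]
datasets = [f"dataset_{i}" for i in range(4)]


def grid_size(include_no_defense: bool) -> int:
    """Count every (attack, defense, ratio, model, dataset) combination."""
    defense_settings = list(defenses)
    if include_no_defense:
        # Assumption: an undefended baseline run counts as a tenth setting.
        defense_settings.append("no_defense")
    combos = product(attacks, defense_settings, poisoning_ratios, models, datasets)
    return sum(1 for _ in combos)


strict_pairs = grid_size(include_no_defense=False)   # 8 * 9 * 5 * 5 * 4
with_baseline = grid_size(include_no_defense=True)   # 8 * 10 * 5 * 5 * 4
print(strict_pairs, with_baseline)
```

Under these assumptions the script prints 7200 and 8000, which matches the TL;DR's phrasing of "up to 8,000" pairs.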
