BackdoorMBTI: A Backdoor Learning Multimodal Benchmark Tool Kit for Backdoor Defense Evaluation

Haiyang Yu, Tian Xie, Jiaping Gui, Pengyang Wang, Ping Yi, Yue Wu

TL;DR

BackdoorMBTI addresses the lack of multimodal backdoor benchmarks by delivering a unified toolkit and benchmark across image, text, and audio. It integrates data processing, data poisoning, backdoor training, and evaluation, and includes a noise generator to simulate real-world conditions. The framework encompasses 11 datasets, 17 attacks, and 7 defenses, enabling cross-modality migration analyses and reproducible comparisons with poisoned datasets and models provided openly. By standardizing evaluation and incorporating realistic noise factors, BackdoorMBTI accelerates the development and rigorous assessment of multimodal backdoor defenses and offers a practical platform for future research.

Abstract

Over the past few years, the emergence of backdoor attacks has presented significant challenges to deep learning systems, allowing attackers to insert backdoors into neural networks. When data containing a trigger is processed by a backdoored model, it can produce the misprediction intended by the attacker, whereas normal data yields regular results. The scope of backdoor attacks is expanding beyond computer vision into areas such as natural language processing and speech recognition. Nevertheless, existing backdoor defense methods are typically tailored to specific data modalities, restricting their application in multimodal contexts. While multimodal learning proves highly applicable in facial recognition, sentiment analysis, action recognition, and visual question answering, the security of these models remains a crucial concern. Specifically, there are no existing backdoor benchmarks targeting multimodal applications or related tasks. To facilitate research on multimodal backdoors, we introduce BackdoorMBTI, the first backdoor learning toolkit and benchmark designed for multimodal evaluation across three representative modalities from eleven commonly used datasets. BackdoorMBTI provides a systematic backdoor learning pipeline, encompassing data processing, data poisoning, backdoor training, and evaluation. The generated poisoned datasets and backdoored models enable detailed evaluation of backdoor defenses. Given the diversity of modalities, BackdoorMBTI facilitates systematic evaluation across different data types. Furthermore, BackdoorMBTI offers a standardized approach to handling practical factors in backdoor learning, such as issues related to data quality and erroneous labels. We anticipate that BackdoorMBTI will expedite future research in backdoor defense methods within a multimodal context. Code is available at https://github.com/SJTUHaiyangYu/BackdoorMBTI.
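
To make the data poisoning stage of the pipeline concrete, the following is a minimal sketch of a dirty-label, patch-trigger poisoning step for image data. It is an illustration only and not the BackdoorMBTI API: the function name poison_dataset, its parameters, and the representation of images as nested lists are all assumptions chosen for the example.

```python
import random


def poison_dataset(samples, labels, target_label, poison_rate, trigger, seed=0):
    """Return poisoned copies of samples and labels, plus the poisoned indices.

    Hypothetical helper (not part of BackdoorMBTI).
    samples: list of 2D lists (grayscale images).
    trigger: dict mapping (row, col) -> pixel value to stamp onto an image.
    """
    rng = random.Random(seed)
    n_poison = int(len(samples) * poison_rate)
    chosen = set(rng.sample(range(len(samples)), n_poison))

    new_samples, new_labels = [], []
    for i, (image, label) in enumerate(zip(samples, labels)):
        image = [row[:] for row in image]  # copy so the clean data is untouched
        if i in chosen:
            for (r, c), value in trigger.items():
                image[r][c] = value        # stamp the trigger pattern
            label = target_label           # relabel to the attacker's target
        new_samples.append(image)
        new_labels.append(label)
    return new_samples, new_labels, sorted(chosen)


# Usage: poison 20% of ten blank 4x4 images with a one-pixel corner trigger.
clean_images = [[[0.0] * 4 for _ in range(4)] for _ in range(10)]
clean_labels = [1] * 10
poisoned_images, poisoned_labels, poisoned_idx = poison_dataset(
    clean_images, clean_labels, target_label=0, poison_rate=0.2,
    trigger={(3, 3): 1.0},
)
```

A model trained on the poisoned set is then expected to behave normally on clean inputs while mapping triggered inputs to the target label; the poison rate and trigger shape here are arbitrary illustrative values.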

Paper Structure

This paper contains 30 sections, 4 figures, 10 tables.

Figures (4)

  • Figure 1: The architecture overview of BackdoorMBTI.
  • Figure 2: The accuracy comparison of various attack-defense pairs. Bar height indicates model accuracy under different defense methods (no defense, ABL, FT, FP, and CLP) for each attack across different modalities. AC, STRIP, and NC are excluded from this comparison because they do not directly produce a clean model. The asterisk denotes a value smaller than 0.001. Gaussian noise (mean 0, variance 1) and text noise (level 0.1) are used in the experiment.
  • Figure 3: The ASR comparison of various attack-defense pairs. Bar height indicates ASR under different defense methods (no defense, ABL, FT, FP, and CLP) for each attack across different modalities. AC, STRIP, and NC are excluded from this comparison because they do not directly produce a clean model. The asterisk denotes a value smaller than 0.001. Gaussian noise (mean 0, variance 1) and text noise (level 0.1) are used in the experiment.
  • Figure 4: The accuracy and ASR comparison of backdoor defenses. Effective methods are typically positioned in the top-left corner, indicating high accuracy and low ASR on sanitized models.
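
The two metrics compared in the figures above, clean accuracy and attack success rate (ASR), can be sketched as follows. This is a hedged illustration rather than the benchmark's own evaluation code: the function names are hypothetical, and excluding samples whose true label already equals the target when computing ASR is a common convention assumed here, not something the source states.

```python
def clean_accuracy(preds, labels):
    """Fraction of clean test samples classified correctly."""
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)


def attack_success_rate(triggered_preds, true_labels, target_label):
    """Fraction of triggered samples classified as the attacker's target.

    Assumption: samples whose true label is already the target are skipped,
    since predicting the target for them does not indicate a successful attack.
    """
    pairs = [(p, y) for p, y in zip(triggered_preds, true_labels)
             if y != target_label]
    if not pairs:
        return 0.0
    hits = sum(p == target_label for p, _ in pairs)
    return hits / len(pairs)


# Usage with toy predictions (values are illustrative only).
acc = clean_accuracy([0, 1, 1, 0], [0, 1, 0, 0])
asr = attack_success_rate([0, 0, 1, 0], [1, 2, 1, 0], target_label=0)
```

An effective defense keeps the first value high and drives the second toward zero on the sanitized model.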