Synthesizing Physical Backdoor Datasets: An Automated Framework Leveraging Deep Generative Models

Sze Jue Yang, Chinh D. La, Quang H. Nguyen, Kok-Seng Wong, Anh Tuan Tran, Chee Seng Chan, Khoa D. Doan

TL;DR

This work addresses the practical barrier of creating physical backdoor datasets by introducing an automated recipe that combines three modules: Trigger Suggestion (VQA-based trigger reasoning), Trigger Generation (text-guided editing or text-to-image generation), and Poison Selection (ImageReward-based ranking). By leveraging deep generative models and a human-aligned evaluation metric, the framework produces realistic poisoned data with physical triggers, so that backdoor threats can be evaluated under real-world conditions. Experiments on ImageNet-5 and on real devices show competitive attack performance (Real CA/Real ASR), reveal how the resilience of defenses differs between the two synthesis paths, and provide Grad-CAM evidence that the model focuses on the trigger. The proposed toolkit gives researchers a practical way to study physical backdoors under laboratory constraints, highlighting both the potential harms and the need for stronger defenses in real-world systems.

Abstract

Backdoor attacks, an emerging threat to the integrity of deep neural networks, have garnered significant attention due to their ability to compromise deep learning systems clandestinely. While numerous backdoor attacks operate within the digital realm, their practical implementation in real-world prediction systems remains limited and vulnerable to disturbances in the physical world. This limitation has given rise to physical backdoor attacks, where the triggers are physical objects in the real world. However, creating the dataset required to train or evaluate a physical backdoor model is a daunting task, which keeps backdoor researchers and practitioners from studying such physical attack scenarios. This paper presents a recipe that lets backdoor researchers create a malicious, physical backdoor dataset with little effort, building on advances in generative modeling. Specifically, the recipe involves three automatic modules: suggesting suitable physical triggers, generating poisoned candidate samples (either by synthesizing new samples or by editing existing clean samples), and finally selecting the most plausible ones. As such, it turns the creation of a physical backdoor dataset from a daunting task into an attainable objective. Extensive experimental results show that datasets created by our "recipe" enable adversaries to achieve an impressive attack success rate on real physical-world data and exhibit properties similar to those reported in previous physical backdoor attack studies. This paper offers researchers a valuable toolkit for studying physical backdoors, all within the confines of their laboratories.
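
The three modules named above can be pictured as a short pipeline. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `Candidate` type, the edit prompt, and the toy scores are all hypothetical. In particular, the VQA model, the generative model, and the ImageReward scorer are replaced by plain Python stand-ins, and "moderate compatibility" is modelled as "closest to the median score", a rule the source does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Candidate:
    source_id: str  # clean image the candidate is derived from
    prompt: str     # text prompt a (hypothetical) editing model would receive


def suggest_trigger(compatibility: Dict[str, float]) -> str:
    """Trigger Suggestion stand-in.

    ASSUMPTION: 'moderate compatibility' is taken to mean the object whose
    score is closest to the median; the real module reasons with a VQA model.
    """
    ranked = sorted(compatibility.values())
    median = ranked[len(ranked) // 2]
    return min(compatibility, key=lambda name: abs(compatibility[name] - median))


def generate_candidates(source_ids: List[str], trigger: str) -> List[Candidate]:
    """Trigger Generation stand-in: builds edit prompts, renders no images."""
    return [Candidate(sid, f"add a {trigger} to the scene") for sid in source_ids]


def select_poison(
    candidates: List[Candidate],
    score_fn: Callable[[Candidate], float],
    keep: int,
) -> List[Candidate]:
    """Poison Selection stand-in: keep the `keep` highest-scoring candidates.

    `score_fn` plays the role of a human-aligned quality score; here it is
    any callable supplied by the caller.
    """
    return sorted(candidates, key=score_fn, reverse=True)[:keep]


if __name__ == "__main__":
    # Toy values for illustration only; they are not results from the paper.
    toy_compatibility = {"tennis ball": 0.9, "book": 0.5, "anvil": 0.1}
    toy_quality = {"img_0": 0.2, "img_1": 0.8, "img_2": 0.5}

    trigger = suggest_trigger(toy_compatibility)
    candidates = generate_candidates(list(toy_quality), trigger)
    selected = select_poison(candidates, lambda c: toy_quality[c.source_id], keep=2)
    print(trigger, [c.source_id for c in selected])
```

Keeping the scorer as a caller-supplied function means the selection step does not depend on any particular reward model, so a real scorer could be swapped in without changing the ranking logic.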

Paper Structure (27 sections, 14 figures, 5 tables)

Figures (14)

  • Figure 1: Images edited/generated by our framework with the trigger = "tennis ball".
  • Figure 2: Overview of our proposed framework, which consists of three modules: (i) Trigger Suggestion, (ii) Trigger Generation, and (iii) Poison Selection. Together they ease the crafting of a physical backdoor dataset.
  • Figure 3: Results from the trigger suggestion module. "Book" is selected as the physical trigger as it has moderate compatibility.
  • Figure 4: Neural Cleanse. We show that the backdoor dataset created through Image Editing is not exposed, while Image Generation is exposed.
  • Figure 5: STRIP. Our backdoor dataset achieves entropy similar to that of the clean dataset, thus bypassing the defense.
  • ...and 9 more figures
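
The STRIP caption above refers to prediction entropy. As a rough illustration of that quantity, and not the evaluation code behind the figure, the sketch below averages the Shannon entropy of a model's class probabilities over several perturbed copies of one input. STRIP treats an unusually low average as a sign of a backdoor trigger, so a poisoned input whose entropy resembles that of clean inputs goes unflagged. The function names and the example probability vectors are hypothetical.

```python
import math
from typing import List, Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one probability vector; zero terms are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_strip_entropy(predictions: List[Sequence[float]]) -> float:
    """Average entropy over the predictions for perturbed copies of one input.

    ASSUMPTION: `predictions` already holds softmax outputs; the perturbation
    step (superimposing other images) and the model itself are not shown here.
    """
    return sum(shannon_entropy(p) for p in predictions) / len(predictions)


if __name__ == "__main__":
    # Made-up probability vectors, for illustration only.
    confident = [[1.0, 0.0], [1.0, 0.0]]   # prediction never changes
    uncertain = [[0.5, 0.5], [0.6, 0.4]]   # prediction varies with perturbation
    print(mean_strip_entropy(confident), mean_strip_entropy(uncertain))
```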