Versatile Backdoor Attack with Visible, Semantic, Sample-Specific, and Compatible Triggers

Ruotong Wang; Hongrui Chen; Zihao Zhu; Li Liu; Baoyuan Wu

TL;DR

By providing the first automated pipeline, VSSC transforms physical backdoor attacks from a labor-intensive craft into a systematic and realistic threat to real-world AI systems and could inspire future studies on designing more practical triggers in backdoor attacks.

Abstract

Deep neural networks (DNNs) can be manipulated to exhibit specific behaviors when exposed to specific trigger patterns, without affecting their performance on benign samples; this is known as a backdoor attack. Implementing backdoor attacks in physical scenarios still faces significant challenges. Physical attacks are labor-intensive and time-consuming, and their triggers are selected manually and heuristically. Moreover, digital attacks are difficult to extend to physical scenarios because their triggers are sensitive to visual distortions and have no counterparts in the real world. To address these challenges, we define a novel trigger called the Visible, Semantic, Sample-Specific, and Compatible (VSSC) trigger, which is designed to be effective, stealthy, and robust simultaneously, and which can also be deployed in the physical scenario using corresponding objects. To implement the VSSC trigger, we propose an automated pipeline comprising three modules: a trigger selection module that systematically identifies suitable triggers by leveraging large language models, a trigger insertion module that employs generative models to seamlessly integrate triggers into images, and a quality assessment module that uses vision-language models to ensure that triggers are inserted naturally and successfully. Extensive experimental results and analysis validate the effectiveness, stealthiness, and robustness of the VSSC trigger. It not only remains robust under visual distortions but also demonstrates strong practicality in the physical scenario. We hope that the proposed VSSC trigger and implementation approach could inspire future studies on designing more practical triggers in backdoor attacks.
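
The control flow of the three-module pipeline described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function `build_poisoned_dataset`, the callables `select_trigger`, `insert_trigger`, and `passes_quality`, the `PoisonedSample` record, the `max_attempts` retry limit, and the single shared `target_label` are all hypothetical names and assumptions introduced here. The callables stand in for the language model, generative model, and vision-language model, none of which is invoked.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List


@dataclass
class PoisonedSample:
    """Hypothetical record for one accepted poisoned image."""
    image: Any
    trigger: str
    label: int


def build_poisoned_dataset(
    benign_images: Iterable[Any],
    select_trigger: Callable[[], str],            # stand-in for the trigger selection module
    insert_trigger: Callable[[Any, str], Any],    # stand-in for the trigger insertion module
    passes_quality: Callable[[Any, str], bool],   # stand-in for the quality assessment module
    target_label: int,                            # assumption: one shared target label
    max_attempts: int = 3,                        # assumption: bounded regeneration
) -> List[PoisonedSample]:
    """Select a text trigger, insert it into each image, keep only accepted results."""
    trigger = select_trigger()
    poisoned: List[PoisonedSample] = []
    for image in benign_images:
        # Regenerate low-quality insertions until one passes or attempts run out.
        for _ in range(max_attempts):
            candidate = insert_trigger(image, trigger)
            if passes_quality(candidate, trigger):
                poisoned.append(PoisonedSample(candidate, trigger, target_label))
                break
    return poisoned
```

In this sketch an image whose insertions never pass the quality check is simply dropped after `max_attempts` tries; how the actual pipeline handles repeated failures is not stated in the abstract.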

Paper Structure (79 sections, 21 figures, 18 tables)

Introduction
Related work
Backdoor Attacks
Backdoor attacks on image classification
Backdoor attacks on object detection
Backdoor attacks on face verification
Backdoor Defenses
Semantic Triggers in Backdoor Attacks
Investigation of the characteristics of backdoor triggers
Methodology
Problem Formulation
Threat model
Notations
Backdoor Attack with Visible, Semantic, Sample-specific, and Compatible (VSSC) Trigger
Trigger selection module
...and 64 more sections

Figures (21)

Figure 1: Comparison of poisoned samples and prediction results of digital backdoor attacks under diverse environmental conditions: without distortion, with digital distortion (Gaussian blur) and physical distortion (print and recapture). For VSSC, given its capability to extend into the physical scenario, an additional demonstration using real objects as triggers is provided.
Figure 2: Connections between desired goals of an ideal backdoor attack and characteristics of our proposed trigger. A solid line indicates that the characteristic directly contributes to the goal, while a dashed line signifies a collaborative effort. Please refer to the section "Investigation of the characteristics of backdoor triggers" for a detailed illustration of these connections.
Figure 3: Overview of our proposed method. The process of generating poisoned samples includes three fundamental modules. To synthesize a poisoned dataset, a text trigger is first selected using the trigger selection module, then inserted into a benign image using the trigger insertion module, and finally evaluated using the quality assessment module to assess insertion effects. Only poisoned images meeting quality criteria are used to form the poisoned dataset, while low-quality images need to be regenerated.
Figure 4: Poisoned samples generated by different attacks for the image classification task. BadNets uses a black-and-white grid as the trigger pattern, while Blended uses an image. For VSSC attack, we demonstrate two semantic triggers used in our experiments for each dataset. Although all these poisoned images are successfully classified as the target label, VSSC triggers in them are the most stealthy and compatible.
Figure 5: Examples of manually captured photos in the image classification task.
...and 16 more figures
