
BELT: Old-School Backdoor Attacks can Evade the State-of-the-Art Defense with Backdoor Exclusivity Lifting

Huming Qiu, Junjie Sun, Mi Zhang, Xudong Pan, Min Yang

TL;DR

This work introduces the concept of backdoor exclusivity and a metric, Excl, that quantifies how uniquely a backdoor responds to its precise trigger rather than to fuzzy, unintended triggers. It then proposes BELT, a Backdoor Exclusivity LifTing method that trains on dirty and cover poisoned samples (the latter carrying masking-based fuzzy triggers) and, in model outsourcing scenarios, adds a Momentum Center Loss to tighten the clustering of original triggers and suppress fuzzy triggers. Evaluations on CIFAR-10, GTSRB, and TinyImageNet across four classic attacks show that BELT substantially increases exclusivity (often to scores in the 80s) while maintaining attack success and clean accuracy, enabling evasion of seven state-of-the-art defenses. The results highlight a pressing security concern for third-party model deployment and suggest that defenses must account for highly exclusive backdoors that resist standard trigger reverse engineering and data- or feature-based detection.

Abstract

Deep neural networks (DNNs) are susceptible to backdoor attacks, in which malicious functionality is embedded so that attackers can trigger incorrect classifications. Old-school backdoor attacks use strong trigger features that victim models learn easily. Although this makes the backdoor robust against input variation, the same robustness increases the likelihood of unintentional trigger activations. This leaves traces for existing defenses, which use techniques such as reverse engineering and sample overlay to find approximate replacements that activate the backdoor without being identical to the original trigger. In this paper, we propose and investigate a new characteristic of backdoor attacks, namely backdoor exclusivity, which measures the ability of backdoor triggers to remain effective in the presence of input variation. Building upon this concept, we propose Backdoor Exclusivity LifTing (BELT), a novel technique that suppresses the association between the backdoor and fuzzy triggers to enhance backdoor exclusivity for defense evasion. Extensive evaluation on three popular backdoor benchmarks validates that our approach substantially enhances the stealthiness of four old-school backdoor attacks, which, after backdoor exclusivity lifting, are able to evade seven state-of-the-art backdoor countermeasures at almost no cost to attack success rate or normal utility. For example, BadNet, one of the earliest backdoor attacks, evades most state-of-the-art defenses once enhanced by BELT, including ABS and MOTH, which would otherwise recognize the backdoored model.
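To make the idea of dirty and cover poisoned samples from the TL;DR more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a patch-style trigger blended through a binary mask, and it assumes that cover samples keep their original label so that the model learns not to associate fuzzy triggers with the target class. All names (`apply_trigger`, `random_fuzzy_mask`, `build_poisoned_set`) and parameters (`dirty_rate`, `cover_rate`, `keep_ratio`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def apply_trigger(image, trigger, mask):
    """Paste `trigger` onto `image` wherever `mask` is 1 (arrays broadcast)."""
    return image * (1.0 - mask) + trigger * mask


def random_fuzzy_mask(mask, keep_ratio, rng):
    """Randomly drop trigger pixels to obtain a fuzzy (partial) trigger mask.

    Assumption: fuzzy triggers are produced by masking out part of the
    original trigger; each trigger pixel is kept with probability keep_ratio.
    """
    keep = rng.random(mask.shape) < keep_ratio
    return mask * keep


def build_poisoned_set(images, labels, trigger, mask, target_label,
                       dirty_rate=0.05, cover_rate=0.05, keep_ratio=0.5,
                       seed=0):
    """Return a poisoned copy of (images, labels) plus the chosen indices.

    Dirty samples: full trigger, label flipped to `target_label`.
    Cover samples: fuzzy trigger, original label kept (assumed behaviour).
    """
    rng = np.random.default_rng(seed)
    images = images.copy()
    labels = labels.copy()

    order = rng.permutation(len(images))
    n_dirty = int(len(images) * dirty_rate)
    n_cover = int(len(images) * cover_rate)
    dirty_idx = order[:n_dirty]
    cover_idx = order[n_dirty:n_dirty + n_cover]

    for i in dirty_idx:
        images[i] = apply_trigger(images[i], trigger, mask)
        labels[i] = target_label

    for i in cover_idx:
        fuzzy = random_fuzzy_mask(mask, keep_ratio, rng)
        images[i] = apply_trigger(images[i], trigger, fuzzy)
        # Label is intentionally left unchanged for cover samples.

    return images, labels, dirty_idx, cover_idx
```

In this sketch, `images` is expected to be a float array of shape `(N, H, W, C)`, `trigger` has shape `(H, W, C)`, and `mask` has shape `(H, W, 1)` with ones over the trigger region. The disjoint index sets ensure that no sample is both a dirty and a cover sample.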

Paper Structure

This paper contains 31 sections, 11 equations, 13 figures, 5 tables, and 1 algorithm.

Figures (13)

  • Figure 1: Examples of fuzzy triggers employed by backdoor countermeasures on classical backdoor attacks.
  • Figure 2: Schematic diagram of backdoor exclusivity.
  • Figure 3: Demonstrations of cover triggers.
  • Figure 4: Overview of BELT in data outsourcing and model outsourcing scenarios.
  • Figure 5: Performance of Neural Cleanse on backdoored models before and after BELT.
  • ...and 8 more figures

Theorems & Definitions (3)

  • Definition 1
  • Definition 2
  • Definition 3