LADDER: Multi-objective Backdoor Attack via Evolutionary Algorithm

Dazhuang Liu, Yanqi Qiao, Rui Wang, Kaitai Liang, Georgios Smaragdakis

TL;DR

LADDER reframes backdoor trigger design in black-box CNNs as a multi-objective optimization problem over $O_1$ (attack effectiveness), $O_2$ (spectral perturbation magnitude), and $O_3$ (dual-domain spectral robustness). It employs a gradient-free MOEA (NSGA-II style) with non-dominated sorting and a preference-based selection to evolve triggers in the low-frequency spectral domain, achieving simultaneous stealthiness in both spatial and spectral domains and robustness to preprocessing. By using a heterogeneous surrogate model for evaluation, LADDER avoids gradients from the victim model yet maintains high ASR ($>99\%$) with minimal ACC loss and strong defense resilience across five datasets. The work provides a concrete, scalable approach to multi-objective trigger design and highlights the need for defenses that monitor spectral-domain anomalies to mitigate dual-domain backdoors.

Abstract

Current black-box backdoor attacks on convolutional neural networks formulate their attack objective(s) as single-objective optimization problems in a single domain. Designing triggers in a single domain harms semantics and trigger robustness and introduces visual and spectral anomalies. This work proposes a multi-objective black-box backdoor attack in dual domains via an evolutionary algorithm (LADDER), the first instance of achieving multiple attack objectives simultaneously by optimizing triggers without requiring prior knowledge of the victim model. In particular, we formulate LADDER as a multi-objective optimization problem (MOP) and solve it via a multi-objective evolutionary algorithm (MOEA). The MOEA maintains a population of triggers with trade-offs among attack objectives and uses non-dominated sorting to drive triggers toward optimal solutions. We further apply preference-based selection to the MOEA to exclude impractical triggers. LADDER investigates a new dual-domain perspective for trigger stealthiness by minimizing the anomaly between clean and poisoned samples in the spectral domain. Lastly, robustness against preprocessing operations is achieved by pushing triggers to low-frequency regions. Extensive experiments show that LADDER achieves attack effectiveness of at least 99%, attack robustness of 90.23% (50.09% higher than state-of-the-art attacks on average), superior natural stealthiness (1.12x to 196.74x improvement), and excellent spectral stealthiness (8.45x enhancement) compared to current stealthy attacks, measured by the average $l_2$-norm across 5 public datasets.
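The non-dominated sorting step mentioned in the abstract can be illustrated with a minimal, generic sketch. It is not the authors' implementation: it uses a simple front-peeling loop rather than the fast bookkeeping of NSGA-II, assumes every objective is minimized, and the objective vectors below (a failure rate paired with a perturbation norm) are hypothetical values chosen only for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every objective and
    strictly better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated_sort(points: List[Sequence[float]]) -> List[List[int]]:
    """Peel off successive Pareto fronts.

    Returns a list of fronts, each a sorted list of indices into `points`;
    front 0 holds the solutions that no other solution dominates.
    """
    remaining = set(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical objective vectors: (failure rate, perturbation l2-norm).
candidates = [(0.05, 0.90), (0.10, 0.40), (0.30, 0.20),
              (0.30, 0.95), (0.50, 0.50)]
fronts = non_dominated_sort(candidates)
print(fronts)  # [[0, 1, 2], [3, 4]]
```

The first front contains the candidates that trade one objective against the other without being beaten on both. A preference-based selection, as described in the abstract, would then further restrict which of these trade-offs are kept.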

Paper Structure

This paper contains 33 sections, 1 theorem, 9 equations, 17 figures, 19 tables, 3 algorithms.

Key Result

Lemma 1

A frequency signal $X=\left\{X^{0},X^{1}, \cdots, X^{N-1}\right\}$ and the corresponding spatial signal $x=\left\{x^{0},x^{1}, \cdots, x^{N-1}\right\}$, where $x=\mathcal{D}^{-1}(X)$ is obtained with type-II IDCT, have the same inner product, i.e., $\left\langle X,X \right\rangle \equiv \left\langle x,x \right\rangle$.
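The energy-preservation property stated in the lemma can be checked numerically. The sketch below assumes the orthonormal scaling of the type-II DCT (the normalization used in the paper is not shown in this excerpt), and the spectrum `X` is a hypothetical example with energy only in its lowest frequencies.

```python
import math
from typing import List


def _scale(k: int, n: int) -> float:
    """Orthonormal DCT-II scaling factor for frequency index k."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct2_ortho(x: List[float]) -> List[float]:
    """Orthonormal type-II DCT of a 1-D signal (direct O(N^2) sum)."""
    n = len(x)
    return [
        _scale(k, n) * sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        for k in range(n)
    ]


def idct2_ortho(X: List[float]) -> List[float]:
    """Inverse of dct2_ortho (the transpose of the orthonormal DCT-II matrix)."""
    n = len(X)
    return [
        sum(
            _scale(k, n) * X[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k in range(n)
        )
        for i in range(n)
    ]


def inner(a: List[float], b: List[float]) -> float:
    """Standard inner product of two equal-length real vectors."""
    return sum(p * q for p, q in zip(a, b))


# Hypothetical spectrum with energy only in the lowest frequencies.
X = [0.8, -0.3, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0]
x = idct2_ortho(X)

energy_spectral = inner(X, X)
energy_spatial = inner(x, x)
print(abs(energy_spectral - energy_spatial) < 1e-9)  # True
```

Under this assumption, the $l_2$-norm of a perturbation measured in the spectral domain equals its $l_2$-norm in the spatial domain. With a non-orthonormal DCT scaling, the two would differ by a constant factor.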

Figures (17)

  • Figure 1: The workflow of LADDER. Steps ①-③: trigger injection; Steps ④-⑥: main loop for trigger optimization; Steps ⑦-⑧: poison the dataset with the trigger and release it to the public; Step ⑨: the backdoor is injected when users download the poisoned data to train/tune their own models. The trigger optimization, evaluation, and injection are controlled by the attacker, whereas the malicious training and inference stages (marked in grey) are unseen by the attacker.
  • Figure 2: The impact of the Lagrange coefficient $\alpha$ on trigger perceptibility and attack failure rate when the backdoor attack is formulated with Lagrange multipliers and solved by SGD.
  • Figure 3: Illustration of conflicting objectives in a backdoor attack, where red and blue dots represent the triggers obtained by LADDER and SGD, respectively, on the victim model. The grey region indicates the objective values of triggers that we prefer to achieve. In this case, we express our preference as ASR$\leftarrow$0.9 and $l_{2}\leftarrow$0.4.
  • Figure 4: (a) Patching a trigger $t=(\delta,\nu)$ into the spectrum of each channel of an RGB image. $\mathcal{D}$ denotes the DCT function in \ref{eq:DCT}. $C_R$, $C_G$, and $C_B$ denote the R, G, and B channels. (b) Optimizing the trigger via MOEA. Exp($\cdot$) denotes sampling from an exponential distribution.
  • Figure 5: Comparison of triggers from MOEA with and without preference-based selection on CIFAR-10 with VGG11. Compared to NDSort, rNDSort pulls LADDER triggers closer to the attacker-desired region.
  • ...and 12 more figures

Theorems & Definitions (1)

  • Lemma 1