Backdoor Contrastive Learning via Bi-level Trigger Optimization

Weiyu Sun, Xinyu Zhang, Hao Lu, Yingcong Chen, Ting Wang, Jinghui Chen, Lu Lin

TL;DR

Contrastive learning pretraining is vulnerable to backdoor triggers, but fixed triggers fail to reliably bind backdoored inputs to the target class under CL dynamics. The authors propose Bi-Level Trigger Optimization (BLTO), which uses an inner surrogate CL procedure and an outer trigger-optimizer to produce triggers that remain close to the target cluster in embedding space during training. BLTO achieves high attack success rates at a low poisoning rate (e.g., $1\%$) across CIFAR-10/100 and ImageNet-100, with transferability across CL strategies and resilience to several defenses, while offering analysis of alignment and uniformity interactions. These results reveal a practical security risk in CL pipelines and motivate development of defenses tailored to SSL/CL systems.

Abstract

Contrastive Learning (CL) has attracted enormous attention due to its remarkable capability in unsupervised representation learning. However, recent work has revealed the vulnerability of CL to backdoor attacks: the feature extractor can be misled into embedding backdoored data close to an attack target class, thus fooling the downstream predictor into misclassifying it as the target. Existing attacks usually adopt a fixed trigger pattern and poison the training set with trigger-injected data, hoping that the feature extractor learns the association between the trigger and the target class. However, we find that such a fixed trigger design fails to effectively associate trigger-injected data with the target class in the embedding space due to mechanisms specific to CL, leading to a limited attack success rate (ASR). This phenomenon motivates us to find a better backdoor trigger design tailored to the CL framework. In this paper, we propose a bi-level optimization approach to achieve this goal, where the inner optimization simulates the CL dynamics of a surrogate victim, and the outer optimization forces the backdoor trigger to stay close to the target throughout the surrogate CL procedure. Extensive experiments show that our attack can achieve a higher attack success rate (e.g., $99\%$ ASR on ImageNet-100) with a very low poisoning rate ($1\%$). In addition, our attack can effectively evade existing state-of-the-art defenses. Code is available at: https://github.com/SWY666/SSL-backdoor-BLTO.
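
To make the inner/outer structure described in the abstract concrete, the following is a minimal toy sketch and not the authors' implementation. It rests on several assumptions that do not come from the paper: a linear surrogate encoder `W`, a single additive trigger `delta` bounded in the L-infinity norm by `eps` (rather than a learned generator), a stand-in CL loss (align two randomly masked views and keep `W W^T` near the identity to avoid collapse), poisoning of target-class samples, and a simple alternation of one inner step and one outer step. All names, sizes, and learning rates are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, eps = 8, 4, 0.5          # toy sizes and trigger budget (assumed)

# Toy data: a "target" class and all remaining data (both hypothetical).
x_target = rng.normal(size=(64, d_in)) + 1.0
x_other = rng.normal(size=(256, d_in)) - 1.0
mu_target = x_target.mean(axis=0)

def inner_grad(W, x):
    """Gradient of a stand-in CL loss: align two randomly masked views of x
    and keep W W^T close to the identity so embeddings do not collapse."""
    m1 = rng.integers(0, 2, size=x.shape)
    m2 = rng.integers(0, 2, size=x.shape)
    diff = x * (m1 - m2)                                  # view1 - view2
    align = (2.0 / len(x)) * (W @ (diff.T @ diff))        # d/dW mean ||W diff_i||^2
    gram = W @ W.T - np.eye(d_out)
    return align + 4.0 * gram @ W                         # d/dW ||W W^T - I||_F^2

def outer_loss(W, delta):
    """Mean squared embedding distance from triggered inputs to the target mean."""
    z = (x_other + delta - mu_target) @ W.T
    return float(np.mean(np.sum(z ** 2, axis=1)))

def outer_grad(W, delta):
    # d/d(delta) of outer_loss: 2 W^T W (mean(x_other) + delta - mu_target)
    return 2.0 * W.T @ W @ (x_other.mean(axis=0) + delta - mu_target)

# Surrogate encoder with orthonormal rows, and a zero initial trigger.
Q, _ = np.linalg.qr(rng.normal(size=(d_in, d_out)))
W = Q.T
delta = np.zeros(d_in)
n_poison = 32                          # number of poisoned samples (assumed)

for step in range(200):
    # Poisoned training set: the current trigger is added to some target-class samples.
    poisoned = x_target[:n_poison] + delta
    train = np.vstack([x_other, x_target[n_poison:], poisoned])
    # Inner step: the surrogate victim takes one update on the poisoned data.
    W -= 0.01 * inner_grad(W, train)
    # Outer step: move the trigger so triggered data embeds near the target,
    # then project back onto the L-infinity ball of radius eps.
    delta -= 0.1 * outer_grad(W, delta)
    delta = np.clip(delta, -eps, eps)

print("distance to target without trigger:", outer_loss(W, np.zeros(d_in)))
print("distance to target with trigger:   ", outer_loss(W, delta))
```

Because the trigger is re-optimized after every surrogate update, it tracks the encoder as it changes instead of being tuned against a single fixed model, which is the property the abstract attributes to the outer optimization. A faithful reproduction would replace the linear encoder and masking loss with a real CL method and follow the released code.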

Paper Structure (25 sections, 3 equations, 10 figures, 16 tables, 3 algorithms)

Figures (10)

  • Figure 1: Left: normalized similarity between the trigger cluster and the target cluster, when performing SimCLR on data poisoned by different attacks; Right: downstream attack success rate.
  • Figure 2: The overview of our proposed BLTO: inner optimization simulates the CL dynamics of a surrogate victim; outer optimization finds a backdoor generator that can adapt to such CL dynamics.
  • Figure 3: Visualizing data embeddings of victim's feature extractor backdoored by different attacks.
  • Figure 4: Alignment and uniformity during backdoor training on SimCLR.
  • Figure 5: Original image (a), backdoored image (b), and their difference, which is the trigger (c).
  • ...and 5 more figures
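
Figure 4 tracks alignment and uniformity during backdoor training. This page does not define the two quantities, so the sketch below assumes the commonly used definitions of Wang and Isola (2020): alignment is the mean distance between embeddings of positive pairs, and uniformity is the log of the mean Gaussian potential over all pairs of embeddings. The function names and the defaults `alpha=2` and `t=2.0` are illustrative choices, and the paper may compute these metrics differently.

```python
import numpy as np

def l2_normalize(z):
    """Project each embedding onto the unit hypersphere."""
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def alignment(z1, z2, alpha=2):
    """Mean of ||z1_i - z2_i||^alpha over positive pairs (lower is better aligned)."""
    return float(np.mean(np.linalg.norm(z1 - z2, axis=1) ** alpha))

def uniformity(z, t=2.0):
    """log mean exp(-t ||z_i - z_j||^2) over distinct pairs (lower is more uniform)."""
    sq = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    iu = np.triu_indices(len(z), k=1)          # each unordered pair once
    return float(np.log(np.mean(np.exp(-t * sq[iu]))))

rng = np.random.default_rng(0)
z = l2_normalize(rng.normal(size=(128, 16)))              # embeddings of one view
z_pos = l2_normalize(z + 0.1 * rng.normal(size=z.shape))  # embeddings of a second view
collapsed = np.tile(z[:1], (128, 1))                      # every point identical

print("alignment:", alignment(z, z_pos))
print("uniformity (spread):", uniformity(z))
print("uniformity (collapsed):", uniformity(collapsed))
```

Under these definitions, a fully collapsed representation has uniformity 0, the maximum possible value, while well-spread embeddings give negative values.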