Transferable Availability Poisoning Attacks

Yiyong Liu, Michael Backes, Xiao Zhang

TL;DR

This work addresses a realism gap in data poisoning: availability attacks designed against a single learning method transfer poorly when the victim is free to choose any algorithm, whether supervised or contrastive. It introduces Transferable Poisoning (TP), which first exploits alignment and uniformity from contrastive learning to craft poisons with strong intra-paradigm transfer, and then iteratively combines gradient signals from supervised and contrastive objectives via a shared backbone to approximate worst-case unlearnability across paradigms. TP demonstrates superior cross-learner transferability on CIFAR-10/100, TinyImageNet, and MiniImageNet, and remains effective across architectures and under partial poisoning. The results indicate that cross-paradigm poisoning can be substantially more threatening than previously thought, and they motivate defenses that account for transfer across learning methods and paradigms.

Abstract

We consider availability data poisoning attacks, where an adversary aims to degrade the overall test accuracy of a machine learning model by crafting small perturbations to its training data. Existing poisoning strategies can achieve the attack goal but assume that the victim employs the same learning method that the adversary uses to mount the attack. In this paper, we argue that this assumption is strong, since the victim may choose any learning algorithm to train the model as long as it achieves some targeted performance on clean data. Empirically, we observe a large decrease in the effectiveness of prior poisoning attacks if the victim employs an alternative learning algorithm. To enhance attack transferability, we propose Transferable Poisoning, which first leverages the intrinsic characteristics of alignment and uniformity to enable better unlearnability within contrastive learning, and then iteratively utilizes the gradient information from supervised and unsupervised contrastive learning paradigms to generate the poisoning perturbations. Through extensive experiments on image benchmarks, we show that our transferable poisoning attack can produce poisoned samples with significantly improved transferability, applicable not only to the two learners used to devise the attack but also to other learning algorithms and even other paradigms.
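
The abstract refers to alignment and uniformity without defining them. As a rough illustration, the sketch below computes the two quantities in the form commonly used in the contrastive learning literature: alignment as the mean distance between embeddings of positive pairs, and uniformity as the log of the mean Gaussian potential over all pairs. The exponent `alpha`, the temperature `t`, the random embeddings, and all function names are assumptions made for this example; the paper's exact loss may differ.

```python
import numpy as np

def l2_normalize(z, eps=1e-12):
    """Project each row onto the unit hypersphere."""
    return z / np.maximum(np.linalg.norm(z, axis=1, keepdims=True), eps)

def alignment_loss(z_a, z_b, alpha=2.0):
    """Mean distance (raised to alpha) between embeddings of positive pairs.

    alpha=2 is an assumed default, not a value taken from the paper.
    """
    return float(np.mean(np.linalg.norm(z_a - z_b, axis=1) ** alpha))

def uniformity_loss(z, t=2.0):
    """Log of the mean Gaussian potential over all distinct embedding pairs.

    t=2 is an assumed default, not a value taken from the paper.
    """
    sq_dists = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    i, j = np.triu_indices(z.shape[0], k=1)  # each unordered pair once
    return float(np.log(np.mean(np.exp(-t * sq_dists[i, j]))))

# Toy embeddings standing in for encoder outputs (hypothetical data).
rng = np.random.default_rng(0)
z_view1 = l2_normalize(rng.normal(size=(64, 16)))
# A second "view" that stays close to the first: well-aligned positives.
z_view2 = l2_normalize(z_view1 + 0.05 * rng.normal(size=(64, 16)))
# Unrelated embeddings: poorly aligned "positives".
z_far = l2_normalize(rng.normal(size=(64, 16)))
# All points mapped to one location: the worst case for uniformity.
collapsed = l2_normalize(np.ones((64, 16)))

align_close = alignment_loss(z_view1, z_view2)
align_far = alignment_loss(z_view1, z_far)
unif_spread = uniformity_loss(z_view1)
unif_collapsed = uniformity_loss(collapsed)

print(f"alignment  (close views): {align_close:.4f}")
print(f"alignment  (unrelated):   {align_far:.4f}")
print(f"uniformity (spread out):  {unif_spread:.4f}")
print(f"uniformity (collapsed):   {unif_collapsed:.4f}")
```

Lower is better for both terms: well-matched views give a small alignment value, and embeddings spread over the sphere give a more negative uniformity value than collapsed ones.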

Paper Structure

This paper contains 22 sections, 6 equations, 6 figures, 10 tables, and 1 algorithm.

Figures (6)

  • Figure 1: Test accuracy (%) of victim models trained by different supervised and unsupervised contrastive learning algorithms on clean and various types of poisoned data. Here, "CP-S", "CP-B", "CP-M" and "CP-A&U" stand for contrastive poisoning with SimCLR, BYOL, MoCo, and the alignment and uniformity loss, respectively. All models use ResNet-18 and are trained on CIFAR-10.
  • Figure 2: Effect of training epochs for generating poisons on CIFAR-10.
  • Figure 3: Effect of partial poisoning on CIFAR-10. "Ours" uses the entire training data with the poisoning proportion adjusted, while "Clean Only" uses only the remaining clean training data.
  • Figure 4: Supervised training process, and linear probing process after SimCLR pretraining, for clean data and various attack methods on CIFAR-10.
  • Figure 5: t-SNE of the generated noise from EM, TAP, CP-BYOL, TUE, T-AAP and Ours on CIFAR-10.
  • ...and 1 more figures
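
The partial-poisoning comparison described in the Figure 3 caption can be sketched as follows. This is only an illustration of how the two training sets might be assembled: the function name, the random placeholder images, the perturbation bound of 8/255, and the random choice of which samples to poison are all assumptions for this example, not details taken from the paper.

```python
import numpy as np

def build_partial_poison_sets(images, perturbations, poison_ratio, seed=0):
    """Return (mixed, clean_only, is_poisoned) for a given poisoning proportion.

    mixed:       full training set with a fraction of the samples perturbed
    clean_only:  only the samples that were left unperturbed
    is_poisoned: boolean mask marking the perturbed samples
    """
    rng = np.random.default_rng(seed)
    n = images.shape[0]
    n_poison = int(round(poison_ratio * n))
    poison_idx = rng.choice(n, size=n_poison, replace=False)
    is_poisoned = np.zeros(n, dtype=bool)
    is_poisoned[poison_idx] = True

    mixed = images.copy()
    # Add the perturbation and keep pixel values in the valid [0, 1] range.
    mixed[is_poisoned] = np.clip(
        images[is_poisoned] + perturbations[is_poisoned], 0.0, 1.0
    )
    clean_only = images[~is_poisoned]
    return mixed, clean_only, is_poisoned

# Hypothetical stand-ins for a dataset and precomputed poisoning perturbations.
rng = np.random.default_rng(1)
epsilon = 8 / 255  # assumed perturbation bound, for illustration only
images = rng.uniform(0.0, 1.0, size=(100, 3, 8, 8))
perturbations = rng.uniform(-epsilon, epsilon, size=images.shape)

mixed, clean_only, is_poisoned = build_partial_poison_sets(
    images, perturbations, poison_ratio=0.8
)
print(mixed.shape, clean_only.shape, int(is_poisoned.sum()))
```

In a real experiment the labels would be indexed with the same mask, so that the mixed set keeps every label while the clean-only set keeps just the labels of the unperturbed samples.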