PoiCGAN: A Targeted Poisoning Based on Feature-Label Joint Perturbation in Federated Learning

Tao Liu, Jiguang Lv, Dapeng Man, Weiye Xi, Yaole Li, Feiyu Zhao, Kuiming Wang, Yingchao Bian, Chen Xu, Wu Yang

Abstract

Federated Learning (FL), a popular distributed learning paradigm, has shown outstanding performance in improving computational efficiency and protecting data privacy, and is widely applied in industrial image classification. However, its distributed nature makes FL vulnerable to malicious clients, with poisoning attacks being a common threat. A major limitation of existing poisoning attacks is that they struggle to bypass model performance tests and defense mechanisms based on model anomaly detection. As a result, poisoned models are often detected and removed, which undermines the practical utility of these attacks. To preserve industrial image classification performance while keeping the attack effective, we propose PoiCGAN, a targeted poisoning attack based on feature-label collaborative perturbation. Our method modifies the inputs of the discriminator and generator in the Conditional Generative Adversarial Network (CGAN) to influence the training process, yielding a poison generator. This generator not only produces specific poisoned samples but also automatically performs label flipping. Experiments across various datasets show that our method achieves an attack success rate 83.97% higher than baseline methods, while reducing main-task accuracy by less than 8.87%. Moreover, the poisoned samples and malicious models exhibit high stealthiness.

Paper Structure

This paper contains 30 sections, 13 equations, 6 figures, 6 tables, and 1 algorithm.

Figures (6)

  • Figure 1: Workflow of PoiCGAN and core module PSG. The left side of the figure shows PoiCGAN's workflow, comprising four steps: (1) poisoned sample generation, (2) local model training, (3) infection of the global model, and (4) model inference. In step (1), PSG generates poisoned samples by incorporating the target label as conditional input into both the discriminator and generator, as shown on the right side. The discriminator training includes: (a) "real source images + target label" to output real, (b) "real images from other classes" to output real, and (c) "fake images + target label" to output fake. The goal of (a) is to mislead the discriminator, guiding the generator to flip the label and produce source class images when conditioned on the target label.
  • Figure 2: Comparison of attack performance and main task performance across different methods on various datasets.
  • Figure 3: Visualization of the 2D local model on InsPLAD.
  • Figure 4: Effect of poisoned model rate on attack and main task performance.
  • Figure 5: Effect of training rounds on attack and main task performance.
  • ...and 1 more figure
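
The three discriminator training cases in the Figure 1 caption can be illustrated with a minimal sketch. It is not the authors' implementation: the names `DiscriminatorExample` and `build_discriminator_batch` are hypothetical, images are stood in for by strings, and the caption does not say which label accompanies the real images in case (b), so the sketch assumes they keep their own labels.

```python
from dataclasses import dataclass
from typing import List

REAL, FAKE = 1.0, 0.0  # outputs the discriminator is trained towards


@dataclass
class DiscriminatorExample:
    image: str      # placeholder for an image tensor
    condition: int  # label given to the discriminator as conditional input
    target: float   # REAL or FAKE


def build_discriminator_batch(
    real_images: List[str],
    real_labels: List[int],
    fake_images: List[str],
    source_class: int,
    target_class: int,
) -> List[DiscriminatorExample]:
    """Assemble (image, condition, target) triples for cases (a), (b), (c)."""
    batch = []
    for image, label in zip(real_images, real_labels):
        if label == source_class:
            # (a) real source image paired with the target label, trained as real.
            # This is the deliberately "wrong" pairing that drives label flipping.
            batch.append(DiscriminatorExample(image, target_class, REAL))
        else:
            # (b) real image from another class, trained as real.
            # Assumption: it is conditioned on its own true label.
            batch.append(DiscriminatorExample(image, label, REAL))
    for image in fake_images:
        # (c) generated image paired with the target label, trained as fake.
        batch.append(DiscriminatorExample(image, target_class, FAKE))
    return batch


# Hypothetical example: class 3 is the source class, class 5 the target class.
batch = build_discriminator_batch(
    real_images=["src_img", "tgt_img", "other_img"],
    real_labels=[3, 5, 7],
    fake_images=["gen_img"],
    source_class=3,
    target_class=5,
)
for example in batch:
    print(example)
```

Under these assumptions, the only real images the discriminator accepts alongside the target label include source-class images, so a generator conditioned on the target label is pushed towards producing source-class content, which matches the goal the caption gives for case (a).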