Logits Poisoning Attack in Federated Distillation

Yuhan Tang, Zhiyuan Wu, Bo Gao, Tian Wen, Yuwei Wang, Sheng Sun

TL;DR

This paper addresses the security vulnerability of Federated Distillation (FD) to logits poisoning attacks by introducing FDLA, a method that subtly manipulates the confidence distribution of logits exchanged during FD. FDLA sorts and reorders class confidences via a ranking function $r(\cdot)$ and a transformation $t(\cdot)$ to produce a contaminated logit vector $c'$, guiding local models toward incorrect yet plausible predictions without touching private data. Empirical evaluation on CIFAR-10 and SVHN under FD and FedCache shows that FDLA degrades accuracy more than baseline poisoning methods, with robustness across data heterogeneity, client count, and model architectures. The results highlight a critical need for defenses against FD-specific threats in cross-device knowledge transfer and establish a baseline for evaluating FD defenses.
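The reordering idea can be illustrated with a minimal sketch. The summary above does not spell out the exact form of $t(\cdot)$, so the mapping below (the top-ranked class receives the lowest confidence and every other class moves up one rank) is an assumption chosen for illustration, and the function names `rank_classes` and `poison_logits` are hypothetical rather than taken from the paper.

```python
import numpy as np


def rank_classes(logits: np.ndarray) -> np.ndarray:
    """Stand-in for r(.): class indices ordered from most to least confident."""
    return np.argsort(-logits, kind="stable")


def poison_logits(logits: np.ndarray) -> np.ndarray:
    """Return a contaminated vector c' holding the same values in a new order.

    Assumed t(.): the originally top-ranked class gets the lowest value and
    every other class inherits the value one rank above it. This is an
    illustrative choice, not necessarily the paper's transformation.
    """
    order = rank_classes(logits)            # classes, highest confidence first
    sorted_vals = logits[order]             # confidence values, descending
    shifted_vals = np.roll(sorted_vals, 1)  # [v0, v1, ..., vk] -> [vk, v0, v1, ...]
    poisoned = np.empty_like(logits)
    poisoned[order] = shifted_vals          # write values back by class index
    return poisoned


clean = np.array([0.1, 0.7, 0.2])  # class 1 is the clean prediction
poisoned = poison_logits(clean)    # class 2 (the runner-up) now ranks first
```

Because the set of confidence values is unchanged and only their assignment to classes moves, the poisoned vector still looks like an ordinary prediction, which is one way to read "incorrect yet plausible" in the summary.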

Abstract

Federated Distillation (FD) is a novel and promising distributed machine learning paradigm in which knowledge distillation is leveraged to enable more efficient and flexible cross-device knowledge transfer in federated learning. By optimizing local models with knowledge distillation, FD avoids uploading large-scale model parameters to the central server while keeping the raw data on local clients. Despite the growing popularity of FD, previous work has left poisoning attacks within this framework largely unexplored, so its vulnerabilities to adversarial actions remain poorly understood. To this end, we introduce FDLA, a poisoning attack method tailored for FD. FDLA manipulates logit communications in FD, aiming to significantly degrade model performance on clients by misleading the discrimination of private samples. Through extensive simulation experiments across a variety of datasets, attack scenarios, and FD configurations, we demonstrate that FDLA effectively compromises client model accuracy, outperforming established baseline algorithms in this regard. Our findings underscore the critical need for robust defense mechanisms in FD settings to mitigate such adversarial threats.

Paper Structure (15 sections, 6 equations, 3 figures, 5 tables, 2 algorithms)

This paper contains 15 sections, 6 equations, 3 figures, 5 tables, 2 algorithms.

Figures (3)

  • Figure 1: Illustration of how attackers manipulate uploaded knowledge.
  • Figure 2: Convergence impact of three attack types on FD and FedCache on CIFAR-10 with 30% attackers, tracking average accuracy per communication round.
  • Figure 3: Illustration of the misleading effect of FDLA on client models. The vertical axis gives the probability statistics of the test results for cat samples, and the horizontal axis gives the predicted categories. The left panel shows the prediction statistics of all models on the test set without tampering; the right panel shows the statistics after FDLA manipulation.