ACE: A Model Poisoning Attack on Contribution Evaluation Methods in Federated Learning

Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bo Li, Radha Poovendran

TL;DR

The paper identifies a vulnerability in the contribution-evaluation methods used in federated learning (FL): a malicious client can earn incentive-based rewards by manipulating its local model updates. It introduces ACE, a model-poisoning attack that predicts the future global model update via the Cauchy mean value theorem, using an L-BFGS Hessian-vector approximation, and adds threshold-based mitigation of prediction error plus an optional amplification step to boost the client's perceived contribution. Theoretical results show that, when contribution is measured by cosine distance, amplification does not decrease the malicious client's standing. Extensive experiments demonstrate ACE's effectiveness across multiple evaluation methods and data-partition settings while preserving final model accuracy. Distance-based and statistical countermeasures offer little protection, underscoring the need for new defenses that secure contribution evaluation in FL and safeguard its incentive mechanisms.
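The prediction step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the first-order relation $\hat{\mathbf{g}}_t = \mathbf{g}_{t-1} + \hat{\mathbf{H}}(\mathbf{w}_t - \mathbf{w}_{t-1})$ suggested by the mean value theorem, and it approximates the Hessian-vector product with the standard compact L-BFGS representation. The function names `lbfgs_hvp` and `predict_global_update`, the choice of initial scaling `sigma`, and the quadratic toy loss are all assumptions made for this example.

```python
import numpy as np


def lbfgs_hvp(S, Y, v):
    """Approximate H @ v with the compact L-BFGS representation.

    S, Y have shape (d, m), oldest pair first: column k of S is a
    global-model difference and column k of Y is the matching
    difference of global updates. Requires s_k @ y_k > 0 for every k.
    """
    sigma = (Y[:, -1] @ Y[:, -1]) / (S[:, -1] @ Y[:, -1])  # B_0 = sigma * I
    StY = S.T @ Y
    L = np.tril(StY, k=-1)         # strictly lower-triangular part of S^T Y
    D = np.diag(np.diag(StY))      # diagonal part of S^T Y
    M = np.block([[sigma * (S.T @ S), L], [L.T, -D]])
    W = np.hstack([sigma * S, Y])  # shape (d, 2m)
    return sigma * v - W @ np.linalg.solve(M, W.T @ v)


def predict_global_update(g_prev, w_prev, w_curr, S, Y):
    """First-order prediction: g_hat = g_prev + H_hat @ (w_curr - w_prev)."""
    return g_prev + lbfgs_hvp(S, Y, w_curr - w_prev)


# Toy check on a quadratic loss, whose gradient is grad(w) = A @ w - b.
rng = np.random.default_rng(0)
d, m = 8, 3                        # m: number of stored pairs (hypothetical)
Q = rng.standard_normal((d, d))
A = Q @ Q.T + d * np.eye(d)        # symmetric positive definite Hessian
b = rng.standard_normal(d)
grad = lambda w: A @ w - b

ws = [rng.standard_normal(d) for _ in range(m + 1)]  # made-up model history
S = np.stack([ws[k + 1] - ws[k] for k in range(m)], axis=1)
Y = np.stack([grad(ws[k + 1]) - grad(ws[k]) for k in range(m)], axis=1)

w_next = ws[-1] + 0.1 * rng.standard_normal(d)
g_hat = predict_global_update(grad(ws[-1]), ws[-1], w_next, S, Y)
rel_err = np.linalg.norm(g_hat - grad(w_next)) / np.linalg.norm(grad(w_next))
print(f"relative prediction error: {rel_err:.3f}")
```

The compact form avoids materializing a d-by-d Hessian: only a 2m-by-2m system is solved, which is why a short history of model and update differences suffices. The approximation reproduces the most recent curvature pair exactly and is only approximate in other directions, which is presumably why a separate error-mitigation step is useful.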

Abstract

In Federated Learning (FL), a set of clients collaboratively train a machine learning model (called the global model) without sharing their local training data. The local training data of clients is typically non-i.i.d. and heterogeneous, resulting in varying contributions from individual clients to the final performance of the global model. In response, many contribution evaluation methods have been proposed, in which the server evaluates the contribution made by each client and incentivizes the high-contributing clients to sustain their long-term participation in FL. Existing studies mainly focus on developing new metrics or algorithms to better measure the contribution of each client. However, the security of contribution evaluation methods in FL operating in adversarial environments is largely unexplored. In this paper, we propose the first model poisoning attack on contribution evaluation methods in FL, termed ACE. Specifically, we show that any malicious client utilizing ACE can manipulate the parameters of its local model such that the server evaluates it as having a high contribution, even when its local training data is in fact of low quality. We perform both theoretical analysis and empirical evaluations of ACE. Theoretically, we show that our design of ACE can effectively boost the malicious client's perceived contribution when the server employs the widely used cosine distance metric to measure contribution. Empirically, our results show that ACE effectively and efficiently deceives five state-of-the-art contribution evaluation methods. In addition, ACE preserves the accuracy of the final global models on testing inputs. We also explore six countermeasures to defend against ACE. Our results show they are inadequate to thwart ACE, highlighting the urgent need for new defenses to safeguard the contribution evaluation methods in FL.
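To make the cosine-based claim concrete, here is a small sketch under simplifying assumptions that are not taken from the paper: the server aggregates by plain averaging, and a client's contribution is scored as the cosine similarity between its submitted update and the aggregated update. The helper names (`cosine`, `mean_aggregate`, `attacker_score`) and the toy vectors are hypothetical. With these assumptions, scaling the attacker's update by a larger factor `c` pulls the aggregate toward the attacker's direction, so the score cannot go down.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_aggregate(updates):
    """Coordinate-wise average of client updates (assumed aggregation rule)."""
    n = len(updates)
    return [sum(col) / n for col in zip(*updates)]


def attacker_score(honest_updates, attacker_update, c):
    """Cosine score of the attacker after it submits c * attacker_update."""
    scaled = [c * x for x in attacker_update]
    global_update = mean_aggregate(honest_updates + [scaled])
    # Cosine similarity is scale-invariant, so scoring `scaled` gives the same value.
    return cosine(attacker_update, global_update)


# Toy data: three honest clients and one attacker pointing elsewhere.
honest = [[1.0, 0.2, 0.0], [0.8, -0.1, 0.3], [0.9, 0.1, -0.2]]
attacker = [0.1, 1.0, 0.5]
factors = (1.0, 2.0, 5.0, 10.0)
scores = [attacker_score(honest, attacker, c) for c in factors]
for c, s in zip(factors, scores):
    print(f"c = {c:4.1f}  cosine score = {s:.4f}")
```

The monotonicity in this toy setting follows from the Cauchy-Schwarz inequality: writing the aggregate as proportional to r + c·a, with a the attacker's update and r the sum of honest updates, the derivative of cos(a, r + c·a) with respect to c is non-negative. Other aggregation rules or scoring functions would need their own analysis.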

Paper Structure

This paper contains 33 sections, 3 theorems, 17 equations, 9 figures, 10 tables, 2 algorithms.

Key Result

Proposition 1

Let $\mathbf{g}' = \mathcal{A}(\mathbf{g}_1, \cdots, c \hat{\mathbf{g}}_i, \cdots , \mathbf{g}_N)$ and $\mathbf{g} = \mathcal{A}(\mathbf{g}_1, \cdots, \hat{\mathbf{g}}_i, \cdots , \mathbf{g}_N)$ be the global model updates obtained using the predicted global model update $c \hat{\mathbf{g}}$ and …

Figures (9)

  • Figure 1: An illustration of ACE consisting of two components: future global model prediction and prediction error mitigation.
  • Figure 2: Comparing the contribution score $CS$ and rank gain $\Delta R$ of the attacker when using ACE and baselines under three datasets, i.e., MNIST (first row), CIFAR-10 (second row), and Tiny-ImageNet (third row), and five contribution evaluation methods, i.e., FedSV, LOO, CFFL, GDR, and RFFL. The data partition method is CLA (a heterogeneous setting). Our results show ACE is more effective than baselines. The results for data partitions UNI and POW are in Figures 3 and 4 of the appendix.
  • Figure 3: Comparing the contribution score $CS$ and rank gain $\Delta R$ of the attacker when using ACE and baselines under three datasets, i.e., MNIST (first row), CIFAR-10 (second row), and Tiny-ImageNet (third row), and five contribution evaluation methods, i.e., FedSV, LOO, CFFL, GDR, and RFFL. The data partition method is UNI (i.i.d. data distribution). Our results show ACE is more effective than baselines under most of the contribution evaluation methods.
  • Figure 4: Comparing the contribution score $CS$ and rank gain $\Delta R$ of the attacker when using ACE and baselines under three datasets, i.e., MNIST (first row), CIFAR-10 (second row), and Tiny-ImageNet (third row), and five contribution evaluation methods, i.e., FedSV, LOO, CFFL, GDR, and RFFL. The data partition method is POW (non-i.i.d. data distribution). Our results show ACE is consistently more effective than baselines.
  • Figure 5: Ablation study with different buffer lengths $m=2,3,\ldots,5$. The numbers annotated in the figure are the rank gains. ACE with non-zero buffer lengths significantly improves the rank gain, without degrading the accuracy. AF is the abbreviation for Attack Free.
  • ...and 4 more figures

Theorems & Definitions (6)

  • Proposition 1
  • Remark 1
  • Corollary 1
  • Remark 2
  • Proposition 2
  • Remark 3