On the Volatility of Shapley-Based Contribution Metrics in Federated Learning

Arno Geimer, Beltran Fiz, Radu State

TL;DR

This work examines whether Shapley-value-based contribution metrics in Federated Learning yield stable and fair rewards across common aggregation strategies. Using gradient-based One-Round Reconstruction on four image datasets with Dirichlet non-IID splits, it compares round-based contributions under eight aggregation methods to a size-based ground truth. Results show that while aggregate contributions often align with ground truth, per-client rewards are highly sensitive to the chosen aggregation strategy, exposing serious stability and fairness concerns. The authors discuss implications for trust in FL ecosystems and propose directions such as ensemble aggregation and scalable Shapley approximations for future work to enhance robustness and practicality.

Abstract

Federated learning (FL) is a collaborative and privacy-preserving machine learning paradigm, allowing the development of robust models without the need to centralize sensitive data. A critical challenge in FL lies in fairly and accurately allocating contributions from diverse participants. Inaccurate allocation can undermine trust and lead to unfair compensation, leaving participants without an incentive to join or actively contribute to the federation. Various remuneration strategies have been proposed to date, including auction-based approaches and Shapley-value-based methods, the latter offering a means to quantify the contribution of each participant. However, little to no work has studied the stability of these contribution evaluation methods. In this paper, we evaluate participant contributions in federated learning using gradient-based model reconstruction techniques with Shapley values and compare the round-based contributions to a classic data contribution measurement scheme. We provide an extensive analysis of the discrepancies of Shapley values across a set of aggregation strategies and examine them on an overall and a per-client level. We show that, between different aggregation techniques, Shapley values lead to unstable reward allocations among participants. Our analysis spans various data heterogeneity distributions, including independent and identically distributed (IID) and non-IID scenarios.

Paper Structure (20 sections, 1 theorem, 4 equations, 3 figures, 3 tables, 1 algorithm)

Key Result

Lemma 1

Let a data set $D$ be split by size into $n$ different subsets $D_1, \dots, D_n$ following a Dirichlet distribution $\mathrm{Dir}((\alpha, \dots, \alpha))$. Then, an equal payout differs from a size-based payout, on average, by $d = \frac{n-1}{n^2 \alpha + n}$ under the squared Euclidean distance.
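
The closed form in Lemma 1 can be checked numerically. The sketch below is not taken from the paper: it assumes that the "equal payout" is the vector $(1/n, \dots, 1/n)$, that the "size-based payout" is the vector of Dirichlet-sampled size shares, and that the distance is the sum of squared component differences. The function names `expected_gap`, `sample_dirichlet`, and `simulated_gap` are hypothetical helpers introduced for illustration. Under these assumptions the expected distance is $n$ times the variance of a single Dirichlet component, which reduces to the stated $d$.

```python
import random


def expected_gap(n, alpha):
    """Closed form stated in Lemma 1: d = (n - 1) / (n^2 * alpha + n)."""
    return (n - 1) / (n * n * alpha + n)


def sample_dirichlet(rng, n, alpha):
    """Draw one symmetric Dirichlet(alpha, ..., alpha) sample.

    Uses the standard construction: normalize n independent
    Gamma(alpha, 1) draws so that they sum to 1.
    """
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(n)]
    total = sum(gammas)
    return [g / total for g in gammas]


def simulated_gap(n, alpha, draws=50_000, seed=0):
    """Monte Carlo estimate of the mean squared Euclidean distance
    between the equal payout (1/n each) and the size-based payout
    (assumed here to be the sampled size shares)."""
    rng = random.Random(seed)
    equal = 1.0 / n
    acc = 0.0
    for _ in range(draws):
        shares = sample_dirichlet(rng, n, alpha)
        acc += sum((s - equal) ** 2 for s in shares)
    return acc / draws


if __name__ == "__main__":
    n, alpha = 10, 0.5
    print("closed form:", expected_gap(n, alpha))
    print("simulated:  ", simulated_gap(n, alpha))
```

For $n = 10$ and $\alpha = 0.5$ the closed form evaluates to $9/60 = 0.15$, and the simulated estimate should approach that value as `draws` grows. Only the standard library is used so the check runs without extra dependencies.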

Figures (3)

  • Figure 1: Density plots of the best value for $R$, as percentage of total rounds, for select strategies. An optimal value minimizes the distance to the ground truth in a run.
  • Figure 2: The distributions of client-wise contribution differences between strategies across all experiments. A value of 0.1 represents a client receiving 10% more of the total reward using one strategy instead of another.
  • Figure 3: Histograms show contribution differences between pairs of strategies. X-axes are fixed across all plots.

Theorems & Definitions (2)

  • Lemma 1
  • proof