A Game-Theoretic Framework for Privacy-Aware Client Sampling in Federated Learning

Wenhao Yuan, Xuehe Wang

TL;DR

This paper tackles privacy-preserving federated learning by jointly optimizing time-varying client sampling and privacy budgets. It introduces FedPCS, which models server–client interactions as a two-stage Stackelberg game and uses a mean-field estimator to compensate for clients' private information that the server and other clients cannot observe, providing closed-form strategies and convergence guarantees. The authors derive an upper bound on accuracy loss, prove the existence of, and convergence to, a Stackelberg Nash Equilibrium, and analyze efficiency via the Price of Anarchy (PoA), showing that random sampling can have an arbitrarily large PoA while the proposed sampling strategy keeps it bounded by a constant. Adaptive extensions address privacy requirements that vary across training phases, yielding a closed-form adaptive sampling ratio, and experiments across multiple datasets support the framework's advantage under IID and Non-IID settings. Overall, FedPCS offers a theory-grounded approach to privacy-aware client selection and incentive design in federated learning.

Abstract

This paper designs a Privacy-aware Client Sampling framework for Federated Learning (FL), named FedPCS, to tackle heterogeneous client sampling issues and improve model performance. First, we obtain a pioneering upper bound on the accuracy loss of the FL model with privacy-aware client sampling probabilities. Based on this bound, we model the interactions between the central server and participating clients as a two-stage Stackelberg game. In Stage I, the central server designs the optimal time-dependent reward for cost minimization by considering the trade-off between the accuracy loss of the FL model and the rewards allocated. In Stage II, each client determines the correction factor that dynamically adjusts its privacy budget, based on the reward allocated, to maximize its utility. Because each client cannot observe other clients' private information, we introduce a mean-field estimator to estimate the average privacy budget. We analytically demonstrate the existence and convergence of the fixed point for the mean-field estimator and derive the Stackelberg Nash Equilibrium to obtain the optimal strategy profile. Through rigorous theoretical convergence analysis, we guarantee the robustness of FedPCS. Moreover, considering the conventional sampling strategy in privacy-preserving FL, we prove that the Price of Anarchy (PoA) of the random sampling approach can be arbitrarily large. To remedy this efficiency loss, we show that the proposed privacy-aware client sampling strategy reduces the PoA, which is upper bounded by a reachable constant. To address the challenge of varying privacy requirements across different training phases in FL, we extend our model and analysis and derive the adaptive optimal sampling ratio for the central server. Experimental results on different datasets demonstrate the superiority of FedPCS over existing state-of-the-art (SOTA) FL strategies under IID and Non-IID data settings.

Paper Structure

This paper contains 39 sections, 15 theorems, 76 equations, 15 figures, 4 tables, 3 algorithms.

Key Result

Proposition 1

At the $t$-th global iteration, by leveraging the Gaussian distribution-based $\rho$-$z$CDP technique to perturb the transmitted local parameter, the variance of the Gaussian random noise $\sigma_{i}^{2}(t)$ of the $i$-th client is derived as $\sigma_{i}^{2}(t) \!=\! \frac{2 W^{2}}{\rho_{i}^{t} |\mathcal{D}
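
The relationship between a zCDP privacy budget and Gaussian noise can be illustrated with a minimal sketch. It relies on the standard Gaussian-mechanism result that noise of variance $\sigma^{2} = \Delta^{2} / (2\rho)$ yields $\rho$-zCDP for a query with L2 sensitivity $\Delta$. The sensitivity $\Delta = 2W / |\mathcal{D}_i|$ (with $W$ read as a clipping bound and $|\mathcal{D}_i|$ as the local dataset size) is an assumption made here for illustration, since the statement above is cut off; the function names are hypothetical and are not taken from the paper.

```python
import math
import random


def zcdp_gaussian_variance(rho: float, clip_bound: float, dataset_size: int) -> float:
    """Noise variance for which the Gaussian mechanism satisfies rho-zCDP.

    Uses the standard relation sigma^2 = sensitivity^2 / (2 * rho).
    ASSUMPTION: the L2 sensitivity is 2 * clip_bound / dataset_size.
    """
    if rho <= 0:
        raise ValueError("rho must be positive")
    if dataset_size <= 0:
        raise ValueError("dataset_size must be positive")
    sensitivity = 2.0 * clip_bound / dataset_size
    return sensitivity ** 2 / (2.0 * rho)


def perturb_parameters(params, rho, clip_bound, dataset_size, rng=None):
    """Add i.i.d. zero-mean Gaussian noise to each local parameter before upload."""
    rng = rng or random.Random()
    sigma = math.sqrt(zcdp_gaussian_variance(rho, clip_bound, dataset_size))
    return [p + rng.gauss(0.0, sigma) for p in params]


if __name__ == "__main__":
    # Hypothetical values: privacy budget 0.5, clipping bound 1.0, 100 local samples.
    variance = zcdp_gaussian_variance(rho=0.5, clip_bound=1.0, dataset_size=100)
    noisy = perturb_parameters([0.1, -0.2, 0.3], 0.5, 1.0, 100, rng=random.Random(0))
    print(variance, noisy)
```

Under these assumptions a larger privacy budget $\rho$ means less noise: doubling $\rho$ halves the variance, which is the lever a client pulls when it adjusts its budget in response to the reward.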

Figures (15)

  • Figure 1: The framework of FedPCS: A $\rho$-$z$CDP technique-based incentive mechanism architecture in FL, where the blue dashed outline indicates the model aggregation and reward allocation operated by the central server, and the black dashed outline indicates the local training, model disturbance, and upload process.
  • Figure 2: The data distribution on the CIFAR-10 dataset under IID and Non-IID settings with the client sampling rate $\tau = 0.2$.
  • Figure 3: The iterative convergence for the mean-field estimator $\phi(t)$.
  • Figure 4: The Pearson correlation coefficient of various schemes on Fashion-MNIST dataset with sampling rate $\tau = 0.2$.
  • Figure 5: The social welfare comparison on Fashion-MNIST/CIFAR-10/SVHN/CIFAR-100/CINIC-10/Tiny-ImageNet datasets under different client sampling strategies.
  • ...and 10 more figures

Theorems & Definitions (21)

  • Definition 1
  • Proposition 1
  • Lemma 1
  • Proposition 2
  • Definition 2
  • Theorem 1
  • Lemma 2
  • Definition 3
  • Theorem 2
  • Theorem 3
  • ...and 11 more