Federated Learning With Individualized Privacy Through Client Sampling

Lucas Lange, Ole Borchardt, Erhard Rahm

TL;DR

This work tackles privacy in Federated Learning (FL) where users have heterogeneous privacy preferences. It introduces IDP-FedAvg, an adaptation of the centralized SAMPLE approach to FL, which computes per-group client sampling rates from the groups' privacy budgets and applies a single global noise multiplier to the sampled set. The method shows improved privacy-utility trade-offs over uniform DP and over the SCALE approach across FMNIST, CIFAR-10, and SVHN under realistic privacy distributions, though performance degrades for small, complex, non-i.i.d. tasks like CIFAR-10. Overall, the approach enables user-controlled privacy in FL with practical utility gains, and points to future exploration of IDP with other aggregation schemes and real-world FL challenges. The privacy budgets $\varepsilon$ and the noise multiplier $\sigma_{\text{SAMPLE}}$ play central roles in balancing privacy and performance.

Abstract

With growing concerns about user data collection, individualized privacy has emerged as a promising solution to balance protection and utility by accounting for diverse user privacy preferences. Instead of enforcing a uniform level of anonymization for all users, this approach allows individuals to choose privacy settings that align with their comfort levels. Building on this idea, we propose an adapted method for enabling Individualized Differential Privacy (IDP) in Federated Learning (FL) by handling clients according to their personal privacy preferences. By extending the SAMPLE algorithm from centralized settings to FL, we calculate client-specific sampling rates based on their heterogeneous privacy budgets and integrate them into a modified IDP-FedAvg algorithm. We test this method under realistic privacy distributions and multiple datasets. The experimental results demonstrate that our approach achieves clear improvements over uniform DP baselines, reducing the trade-off between privacy and utility. Compared to the alternative SCALE method in related work, which assigns differing noise scales to clients, our method performs notably better. However, challenges remain for complex tasks with non-i.i.d. data, primarily stemming from the constraints of the decentralized setting.
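
To make the sampling idea concrete, the following is a minimal sketch of one aggregation round with per-group client sampling and a single global noise scale. It is not the authors' IDP-FedAvg implementation: the function name `idp_fedavg_round`, its parameters, the Poisson-style sampling, and the normalization by the expected number of sampled clients are all illustrative assumptions. In particular, the per-group rates in `group_rate` are taken as given here, whereas the paper derives them from the privacy budgets.

```python
import math
import random


def idp_fedavg_round(global_w, client_updates, client_group, group_rate,
                     clip_norm, noise_multiplier, rng):
    """Hypothetical sketch of one round; all names here are assumptions."""
    # Sample each client independently with the rate of its privacy group.
    # Groups with stricter budgets would get lower rates (assumed precomputed).
    sampled = [rng.random() < group_rate[g] for g in client_group]

    total = [0.0] * len(global_w)
    for update, keep in zip(client_updates, sampled):
        if not keep:
            continue
        # Clip the update so its L2 norm is at most clip_norm.
        norm = math.sqrt(sum(x * x for x in update))
        scale = min(1.0, clip_norm / max(norm, 1e-12))
        total = [t + scale * x for t, x in zip(total, update)]

    # Add Gaussian noise with one global scale to the summed updates.
    std = noise_multiplier * clip_norm
    total = [t + rng.gauss(0.0, std) for t in total]

    # Normalize by the expected number of sampled clients (assumed choice;
    # requires at least one group with a positive rate).
    expected = sum(group_rate[g] for g in client_group)
    new_w = [w + t / expected for w, t in zip(global_w, total)]
    return new_w, sampled


# Usage: two privacy groups with made-up rates and toy one-dimensional updates.
rng = random.Random(0)
weights, mask = idp_fedavg_round(
    global_w=[0.0],
    client_updates=[[0.5], [2.0], [-1.0]],
    client_group=["strict", "relaxed", "relaxed"],
    group_rate={"strict": 0.2, "relaxed": 0.8},
    clip_norm=1.0,
    noise_multiplier=1.1,
    rng=rng,
)
```

The point of the sketch is that privacy heterogeneity enters only through the sampling step, while clipping and noise stay identical for every client. A SCALE-style variant would instead vary the noise scale per client.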

Paper Structure

This paper contains 15 sections, 1 equation, 2 figures, 1 table, 2 algorithms.

Figures (2)

  • Figure 1: Examples of label distribution on clients for our datasets (non-i.i.d.).
  • Figure 2: CNN model architecture used in our experiments.