Controlled privacy leakage propagation throughout overlapping grouped learning

Shahrzad Kiani, Franziska Boenisch, Stark C. Draper

TL;DR

This work proposes differential private overlapping grouped learning (DP-OGL), a novel method to implement privacy guarantees within overlapping groups and derives novel privacy guarantees between arbitrary pairs of workers under the honest-but-curious threat model.

Abstract

Federated Learning (FL) is the standard protocol for collaborative learning. In FL, multiple workers jointly train a shared model. They exchange model updates calculated on their data, while keeping the raw data itself local. Since workers naturally form groups based on common interests and privacy policies, we are motivated to extend standard FL to reflect a setting with multiple, potentially overlapping groups. In this setup, where workers can belong and contribute to more than one group at a time, complexities arise in understanding privacy leakage and in adhering to privacy policies. To address these challenges, we propose differential private overlapping grouped learning (DP-OGL), a novel method to implement privacy guarantees within overlapping groups. Under the honest-but-curious threat model, we derive novel privacy guarantees between arbitrary pairs of workers. These privacy guarantees describe and quantify two key effects of privacy leakage in DP-OGL: propagation delay, i.e., the fact that information from one group will leak to other groups only with a temporal offset through the common workers, and information degradation, i.e., the fact that noise addition over model updates limits information leakage between workers. Our experiments show that applying DP-OGL enhances utility while maintaining strong privacy compared to standard FL setups.

Paper Structure

This paper contains 41 sections, 15 theorems, 89 equations, 5 figures, 1 table, 2 algorithms.

Key Result

Theorem 1

Setting $\epsilon_{m'}=\frac{2\pi_{m'}^2\alpha}{\sigma_{m'}^2}$, targeting worker $n\in \mathcal{N}$, and running for $t$ epochs, our algorithms achieve $(\alpha,\epsilon_n^{1:t})$-PwP with $\epsilon_n^{1:t}$ defined as in (eq:worstcase). Under Threat Model 1, in the DP-OGL algorithm, […]. Under Threat Model 2, in the DP-OGL+ algorithm, for every $i\in \mathcal{N}\backslash$ […]
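
To make the per-group quantity in Theorem 1 concrete, the following minimal Python sketch evaluates $\epsilon_{m'}=\frac{2\pi_{m'}^2\alpha}{\sigma_{m'}^2}$ only. It does not reproduce the full bound $\epsilon_n^{1:t}$, whose definition is not shown here. The function name `per_group_epsilon` is hypothetical, and the order $\alpha=2$ is an assumed value chosen for illustration; $\pi_m=0.7$, $\sigma_m=2$, and $\sigma_m=1.3$ are the values quoted in the caption of Figure 3.

```python
def per_group_epsilon(pi_m: float, sigma_m: float, alpha: float) -> float:
    """Evaluate epsilon_m = 2 * pi_m^2 * alpha / sigma_m^2 from Theorem 1.

    pi_m    : the parameter pi_m of group m (0.7 in the Figure 3 caption)
    sigma_m : noise multiplier of group m
    alpha   : order alpha of the (alpha, epsilon) guarantee
    """
    if sigma_m <= 0:
        raise ValueError("sigma_m must be positive")
    return 2.0 * pi_m ** 2 * alpha / sigma_m ** 2


# alpha = 2 is an assumption for illustration, not a value from the paper.
ALPHA = 2.0

eps_sigma_2 = per_group_epsilon(pi_m=0.7, sigma_m=2.0, alpha=ALPHA)
eps_sigma_1_3 = per_group_epsilon(pi_m=0.7, sigma_m=1.3, alpha=ALPHA)

print(f"sigma_m = 2.0 -> epsilon_m = {eps_sigma_2:.4f}")
print(f"sigma_m = 1.3 -> epsilon_m = {eps_sigma_1_3:.4f}")
```

With these inputs the first value is approximately 0.49. Because $\epsilon_{m'}$ scales as $1/\sigma_{m'}^2$, the smaller noise multiplier yields the larger value, which matches the information-degradation effect described in the abstract.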

Figures (5)

  • Figure 1: Seven workers collaborate in (a) one, (b) five groups. Groups are shown as colored circles, workers as humanoid shapes, and masters as antenna symbols. Workers of the same color share interests and privacy constraints.
  • Figure 2: Group structure involves $N=3$ workers, collaborating in (a) $M=1$ group (considered as baseline), (b) $M=2$ overlapping groups (string structure). Learning occurs across epochs $\tau \in [3]$, with the results of learned models distributed to workers in epoch 4, and collaborative learning structured in (c) a single group, and (d) overlapping groups.
  • Figure 3: We plot the average training loss vs. epoch in (a), (c), (e), and (g), and the average test accuracy vs. epoch in (b), (d), (f), and (h). In legends, "LB", "CL", and "RI" stand for label-based, clustered, and ring group structures. In (a) and (b), the legends also include four parameters: the number of workers $N$, the number of groups $M$, the value of $S$, and the noise multiplier $\sigma_m$, listed left to right. In (c), (d), (g), and (h), both DP-OGL and DP-OGL+ adopt an RI structure, with $S=10$ for DP-OGL and $S=2$ for DP-OGL+. In (c) and (d), $\sigma_m=2$ is fixed across all curves, except for DP-OGL+ with $N=40$, where $\sigma_m=1.3$. In (g) and (h), $N/M=10$ is fixed. All subfigures maintain $c_m=0.05, \pi_m=0.7,$ and $L=10$. The baseline corresponds to the GL structure with $M=1$ and $S=1$.
  • Figure 4: PwP bounds are displayed in (a). Heatmaps are displayed for $\epsilon_{n,i}^{1:200}$ across $n,i\in [100]$ under group structures: (b) fully global (GL) with $(M,S)=(1,1)$, (c) clustered (CL) with $(M,S)=(4,2)$, (d) label-based (LB) with $(M,S)=(5,10)$, (e) ring (RI) with $(M,S)=(4,10)$, (f) RI with $(M,S)=(4,25)$, (g) RI with $(M,S)=(10,10)$, and (h) RI with $(M,S)=(4,10)$. In (a)-(g), DP-OGL is run. In (h), DP-OGL+ is run. In heatmaps, darker colors (near 0 in (b)-(g) and near 4 in (h)) indicate lower privacy leakage. Lighter colors (near 30 in (b)-(g) and near 5 in (h)) indicate higher leakage. Omitted (white) cells correspond to the pairs of workers with mutual trust.
  • Figure 5: In legends, "LB", "CL", and "RI" stand for label-based, clustered, and ring group structures. The legends also include four parameters: the number of workers $N$, the number of groups $M$, the value of $S$, and the noise multiplier $\sigma_m$, listed left to right.
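
The propagation-delay effect can be illustrated on a string structure like the one in Figure 2(b). The sketch below is a rough illustration under stated assumptions, not the paper's analysis. It assumes the two groups are {1, 2} and {2, 3}, with worker 2 as the common worker, and it counts how many group hops separate each worker from a source worker. The hop count serves only as a proxy for the temporal offset. The function name `hop_distances` and the group labels are hypothetical.

```python
from collections import deque


def hop_distances(groups: dict, source) -> dict:
    """Breadth-first search over workers.

    Two workers are treated as adjacent when they share at least one group.
    Returns the minimum number of group hops from `source` to every
    reachable worker.
    """
    # Invert the membership map: worker -> set of group labels.
    memberships = {}
    for label, workers in groups.items():
        for worker in workers:
            memberships.setdefault(worker, set()).add(label)

    distances = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for label in memberships[current]:
            for neighbor in groups[label]:
                if neighbor not in distances:
                    distances[neighbor] = distances[current] + 1
                    queue.append(neighbor)
    return distances


# Assumed memberships for a string of M=2 groups over N=3 workers.
string_groups = {"group_1": {1, 2}, "group_2": {2, 3}}

print(hop_distances(string_groups, source=1))  # {1: 0, 2: 1, 3: 2}
```

Worker 3 shares no group with worker 1, so anything it learns about worker 1 has to pass through worker 2 first. In the single-group baseline of Figure 2(a), every worker would be one hop from every other.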

Theorems & Definitions (31)

  • Definition 1: Threat Model 1 (TM 1)
  • Definition 2: Threat Model 2 (TM 2)
  • Definition 3
  • Definition 4: Per-Worker Privacy (PwP)
  • Definition 5
  • Definition 6: Combined mechanism
  • Theorem 1
  • Lemma 1
  • Definition 7: String
  • Theorem 2
  • ...and 21 more