Guaranteeing Data Privacy in Federated Unlearning with Dynamic User Participation

Ziyao Liu; Yu Jiang; Weifeng Jiang; Jiale Guo; Jun Zhao; Kwok-Yan Lam

TL;DR

This work systematically studies how to integrate SecAgg protocols into clustering-based federated unlearning, the most widely used FU approach, to build a privacy-preserving FU framework that protects user privacy while handling dynamic user participation.

Abstract

Federated Unlearning (FU) is gaining prominence for its ability to remove the influence of Federated Learning (FL) users' data from trained global FL models. A straightforward FU method removes the unlearned users and then retrains a new global FL model from scratch with all remaining users, which incurs considerable overhead. To improve unlearning efficiency, a widely adopted strategy uses clustering: FL users are divided into clusters, and each cluster maintains its own FL model. The final inference is then determined by majority vote over the inferences of these sub-models. To remove a user, this method confines the unlearning process to that user's cluster, which improves efficiency because the remaining users in other clusters need not participate. However, current clustering-based FU schemes mainly concentrate on refining the clustering to boost unlearning efficiency and overlook the potential information leakage from FL users' gradients, a privacy concern that has been extensively studied. In principle, integrating secure aggregation (SecAgg) schemes within each cluster can enable privacy-preserving FU. Nevertheless, designing a clustering methodology that seamlessly incorporates SecAgg schemes is challenging, particularly in scenarios involving adversarial users and dynamic users. To this end, we systematically explore the integration of SecAgg protocols within the most widely used federated unlearning scheme, which is based on clustering, to establish a privacy-preserving FU framework that ensures privacy while effectively managing dynamic user participation. Theoretical analysis and experimental results show that our proposed scheme achieves comparable unlearning effectiveness while offering improved privacy protection and resilience to varying user participation.
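The clustering idea described in the abstract can be illustrated with a toy sketch. It assumes a hypothetical round-robin cluster assignment and a stand-in `train_cluster` function (a threshold at the mean of each member's scalar data) in place of real FL training. None of these names come from the paper, and SecAgg is omitted entirely.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

Model = Callable[[float], int]  # hypothetical: a cluster model maps an input to a label


def assign_clusters(users: Sequence[str], num_clusters: int) -> Dict[int, List[str]]:
    """Round-robin assignment; a placeholder, not the paper's clustering method."""
    clusters: Dict[int, List[str]] = {c: [] for c in range(num_clusters)}
    for i, user in enumerate(users):
        clusters[i % num_clusters].append(user)
    return clusters


def train_cluster(members: Sequence[str], data: Dict[str, float]) -> Model:
    """Toy stand-in for FL training inside one cluster: threshold at the members' mean."""
    threshold = sum(data[u] for u in members) / len(members)
    return lambda x: int(x >= threshold)


def majority_vote(models: Sequence[Model], x: float) -> int:
    """Final inference: the label predicted by the largest number of cluster models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


def unlearn(clusters: Dict[int, List[str]], models: Dict[int, Model],
            data: Dict[str, float], target: str) -> int:
    """Remove `target` and retrain only its own cluster from scratch."""
    for cid, members in clusters.items():
        if target in members:
            members.remove(target)
            models[cid] = train_cluster(members, data)
            return cid
    raise KeyError(f"unknown user: {target}")


# Made-up scalar "datasets" for six users, split into three clusters.
data = {"u0": 1.0, "u1": 2.0, "u2": 3.0, "u3": 4.0, "u4": 5.0, "u5": 6.0}
clusters = assign_clusters(list(data), num_clusters=3)
models = {cid: train_cluster(members, data) for cid, members in clusters.items()}

before = majority_vote(list(models.values()), 3.0)
retrained = unlearn(clusters, models, data, "u5")  # only u5's cluster is retrained
after = majority_vote(list(models.values()), 3.0)
```

In this toy run only the cluster that contained `u5` is retrained, while the other two cluster models are left untouched. That locality is the efficiency argument made in the abstract.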

Paper Structure (20 sections, 6 theorems, 22 equations, 11 figures, 1 table, 2 algorithms)

Introduction
Preliminaries and Notations
Federated Learning & Unlearning
Secure Aggregation
Hypergeometric distribution
$m$-regular graph
FL model convergence
Notations
Requirements on Clustering
Security Requirements on Clustering
User Clustering
Handling Unlearning Requests
Sequential unlearning
Batch unlearning
Putting it all together
...and 5 more sections

Key Result

Lemma 1

(Shamir security) For a user $u_i \in U$, its cluster $c(u_i)$ consists of $k$ users. Given a parameter $\xi = t/k$ such that $\xi > \gamma$, if the set of adversarial users $A \subseteq U$ satisfies $|A| \leq \gamma N$ and the cluster size $k$ fulfills the following inequality (equ:shamir_security)
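The inequality itself is not reproduced in this summary. As a hedged illustration of the kind of quantity a Shamir-security condition typically bounds, the sketch below computes the probability that a cluster of $k$ users contains at least $t$ adversarial users. It assumes the cluster is a uniformly random size-$k$ subset of the $N$ users, so that the count of adversaries follows a hypergeometric distribution. The function names, the $2^{-\sigma}$ target, and the rounding $t = \lceil \xi k \rceil$ are assumptions for illustration, not the paper's stated bound.

```python
from math import ceil, comb


def adversary_tail(n_users: int, n_adv: int, k: int, t: int) -> float:
    """P[X >= t] for X ~ Hypergeometric(n_users, n_adv, k).

    X counts adversarial users in a cluster of size k drawn uniformly
    without replacement (an assumption, see the lead-in).
    """
    total = comb(n_users, k)
    hits = sum(
        comb(n_adv, j) * comb(n_users - n_adv, k - j)
        for j in range(max(t, 0), min(k, n_adv) + 1)
    )
    return hits / total


def smallest_safe_k(n_users: int, n_adv: int, xi: float, sigma: int):
    """Smallest k with tail probability <= 2**-sigma, using t = ceil(xi * k).

    The 2**-sigma target is an assumed convention, not taken from the paper.
    Returns (k, t), or None if no k up to n_users qualifies.
    """
    for k in range(1, n_users + 1):
        t = ceil(xi * k - 1e-9)  # small slack guards against float round-up
        if adversary_tail(n_users, n_adv, k, t) <= 2.0 ** -sigma:
            return k, t
    return None


# Example: N = 1000 users, gamma = 0.2 (200 adversaries), xi = 0.7, sigma = 40.
result = smallest_safe_k(1000, 200, 0.7, 40)
```

Because $\xi > \gamma$, the threshold $t$ sits above the expected number of adversaries in a cluster, so the tail probability shrinks quickly as $k$ grows. A condition of this kind can therefore be met with clusters much smaller than $N$.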

Figures (11)

Figure 1: An illustrative example of clustering-based FU, adapted from Bourtoule et al. (2021). FL users are divided into clusters. During the inference phase, test data is input into the model of each cluster, and the inferences from all clusters are aggregated by majority voting to produce the final result. If an unlearning request is initiated to remove the target user $u_t$ in cluster $c_2$, only the remaining users in cluster $c_2$ need to carry out the unlearning process, i.e., retraining from scratch in an FL manner within the cluster. Users in other clusters can continue their FL training or carry out unlearning according to the requests initiated in their own clusters.
Figure 2: The sparse communication graph of SecAgg+ (Bell et al., 2020) compared with the complete communication graph of SecAgg (Bonawitz et al., 2017). Each user in (b) communicates with far fewer users than in (a).
Figure 3: An illustrative example of graph connectivity, where an edge indicates that the two users it connects are involved in pairwise masking. In (b), $u_2, u_4, u_6$ are dropout users and $u_8$ is an unlearned user; in (c), $u_4, u_6$ are dropout users, $u_8$ is an unlearned user, and $u_2$ is an adversarial user.
Figure 4: Comparison of the required cluster size $k$ across different parameter settings, varying (a) the security and correctness parameters $\sigma$ and $\eta$, with $\{\gamma, \delta, \zeta, \xi\}$ set to $\{0.2, 0.2, 0.1, 0.7\}$; (b) the fractions of adversarial and dropout users $\gamma$ and $\delta$, with $\{\sigma, \eta, \zeta, \xi\}$ set to $\{40, 40, 0.1, 0.7\}$; and (c) the fraction of unlearned users $\zeta$ within a cluster, with $\{\sigma, \eta, \gamma, \delta, \zeta, \xi\}$ set to $\{40, 40, 0.1, 0.1, 0.1, 0.7\}$.
Figure 5: Comparison of the unlearning capacities $\tau_{seq}$ for sequential unlearning and $\tau_{bat}$ for batch unlearning under various parameter settings.
...and 6 more figures

Theorems & Definitions (21)

Lemma 1
proof
Lemma 2
proof
Lemma 3
proof
Lemma 4.1
proof
Lemma 4.2
proof
...and 11 more
