Communication Efficient and Provable Federated Unlearning

Youming Tao, Cheng-Long Wang, Miao Pan, Dongxiao Yu, Xiuzhen Cheng, Di Wang

TL;DR

This work addresses the problem of exact federated unlearning, ensuring that removing a data point or client yields a model statistically indistinguishable from one trained without them. It introduces TV-stability as a lever to achieve exact unlearning with communication efficiency, and proposes FATS, a TV-stable FL algorithm with local SGD and strategic sub-sampling. Two unlearning procedures, FATS-SU and FATS-CU, provide provable, transport-based guarantees that the unlearned model matches the distribution of the model trained on updated data, with recomputation probabilities bounded by $\rho_S w$ and $\rho_C w$. The authors establish convergence guarantees, analyze time/space overheads, and validate the framework on six benchmarks, showing competitive learning performance and superior unlearning efficiency and privacy protection relative to state-of-the-art baselines.
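The bounded recomputation probability can be pictured with a small sketch. It assumes (this summary does not spell out the procedure) that training records which sample indices each round's mini-batch used, and that a deletion request only forces recomputation from the first round that touched the deleted sample. The names `sample_history` and `first_affected_round` are hypothetical and are not taken from the paper.

```python
import random

def sample_history(num_samples, batch_size, num_rounds, seed=0):
    """Record the sample indices drawn for each round's mini-batch (assumed bookkeeping)."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(num_samples), batch_size))
            for _ in range(num_rounds)]

def first_affected_round(history, deleted_index):
    """Return the first round whose mini-batch used the deleted sample, or None."""
    for t, batch in enumerate(history):
        if deleted_index in batch:
            return t
    return None

history = sample_history(num_samples=1000, batch_size=5, num_rounds=20)
used = {i for batch in history for i in batch}

# A sample that was never drawn: the trained model never depended on it.
untouched = next(i for i in range(1000) if i not in used)
# A sample that was drawn in round 3 (it may also appear in an earlier round).
touched = history[3][0]

print(first_affected_round(history, untouched))  # None: no recomputation needed
print(first_affected_round(history, touched))    # index of the round to restart from
```

With small sampling rates most deletion requests fall into the first case, which is the intuition behind a recomputation probability that scales with the stability parameter. A full procedure would also have to re-run training on the updated data from the affected round onward; that step is omitted here.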

Abstract

We study federated unlearning, a novel problem to eliminate the impact of specific clients or data points on the global model learned via federated learning (FL). This problem is driven by the right to be forgotten and the privacy challenges in FL. We introduce a new framework for exact federated unlearning that meets two essential criteria: communication efficiency and exact unlearning provability. To our knowledge, this is the first work to tackle both aspects coherently. We start by giving a rigorous definition of exact federated unlearning, which guarantees that the unlearned model is statistically indistinguishable from the one trained without the deleted data. We then pinpoint the key property that enables fast exact federated unlearning: total variation (TV) stability, which measures the sensitivity of the model parameters to slight changes in the dataset. Leveraging this insight, we develop a TV-stable FL algorithm called FATS (FedAvg for TV Stability), which modifies the classical FedAvg algorithm to be TV-stable and employs local SGD with periodic averaging to reduce the number of communication rounds. We also design efficient unlearning algorithms for FATS under two settings: client-level and sample-level unlearning. We provide theoretical guarantees for our learning and unlearning algorithms, proving that they achieve exact federated unlearning with reasonable convergence rates for both the original and unlearned models. We empirically validate our framework on 6 benchmark datasets, and show its superiority over state-of-the-art methods in terms of accuracy, communication cost, computation cost, and unlearning efficacy.
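The training pattern the abstract describes (local SGD with periodic averaging, combined with sub-sampling of clients and samples) can be illustrated with a toy sketch. This is not the FATS algorithm itself: the scalar model, the quadratic loss, and every parameter name below (`clients_per_round`, `local_steps`, `batch_size`, `lr`) are assumptions chosen only to keep the example self-contained.

```python
import random

def local_sgd(w, data, local_steps, batch_size, lr, rng):
    """Run several local SGD steps on one client for the loss 0.5 * (w - x)^2."""
    for _ in range(local_steps):
        batch = rng.sample(data, batch_size)            # sample-level sub-sampling
        grad = sum(w - x for x in batch) / batch_size   # mini-batch gradient
        w -= lr * grad
    return w

def train(clients, rounds, clients_per_round, local_steps, batch_size, lr, seed=0):
    """Toy federated loop: sub-sample clients, train locally, average once per round."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(rounds):
        chosen = rng.sample(range(len(clients)), clients_per_round)  # client-level sub-sampling
        local_models = [
            local_sgd(w, clients[k], local_steps, batch_size, lr, rng)
            for k in chosen
        ]
        # Periodic averaging: one communication per round, however many local steps ran.
        w = sum(local_models) / len(local_models)
    return w

# Synthetic data: 10 clients, each holding 50 noisy observations of the value 3.0.
data_rng = random.Random(1)
clients = [[data_rng.gauss(3.0, 0.1) for _ in range(50)] for _ in range(10)]

w_final = train(clients, rounds=30, clients_per_round=4,
                local_steps=5, batch_size=8, lr=0.2)
print(round(w_final, 2))  # close to 3.0
```

Increasing `local_steps` while holding the total number of gradient steps fixed lowers the number of communication rounds, which is the efficiency lever the abstract refers to. Lowering `clients_per_round` or `batch_size` reduces the chance that any particular client or sample is ever used.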

Paper Structure

This paper contains 49 sections, 6 theorems, 12 equations, 8 figures, 2 tables, 3 algorithms.

Key Result

Lemma 1

For any given $\rho_S, \rho_C\in(0,1]$, FATS is $\min\{\rho_S, 1\}$ sample-level TV-stable and $\min\{\rho_C, 1\}$ client-level TV-stable.
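For context, the lemma can be read against the usual notion of total variation stability. Definitions 3 and 4 are listed but not reproduced in this summary, so the form below is an assumption about their shape rather than a quotation; the symbols $\mathcal{A}$ (the learning algorithm), $D$ and $D'$ (neighboring datasets), and $E$ (a measurable event) are introduced here for illustration.

```latex
% Total variation distance between two distributions P and Q
\mathrm{TV}(P, Q) \;=\; \sup_{E} \bigl| P(E) - Q(E) \bigr|

% Assumed form of \rho-TV-stability: the output distribution of \mathcal{A}
% moves by at most \rho when the dataset changes by one sample
% (sample-level, \rho = \rho_S) or by one client (client-level, \rho = \rho_C)
\sup_{D,\, D' \text{ neighboring}} \mathrm{TV}\bigl(\mathcal{A}(D),\, \mathcal{A}(D')\bigr) \;\le\; \rho
```

Because total variation distance never exceeds 1, any stability parameter above 1 would carry no information, which is consistent with the $\min\{\cdot, 1\}$ in the lemma statement.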

Figures (8)

  • Figure 1: Comparison of the test accuracy of different methods and their changes after unlearning on Cifar-100, FEMNIST, and Shakespeare. Top: sample-level unlearning. Bottom: client-level unlearning.
  • Figure 2: Unlearning Efficiency of FATS compared with FRS.
  • Figure 3: Impacts of the number of unlearning requests on unlearning efficiency.
  • Figure 4: Impacts of stability parameters on learning utility and unlearning efficiency.
  • Figure 5: Comparison of the test accuracy of different methods and their changes after conducting unlearning on MNIST, Fashion-MNIST, and Cifar-10. Top: sample-level unlearning. Bottom: client-level unlearning.
  • ...and 3 more figures

Theorems & Definitions (15)

  • Definition 1: Sample-level Exact Federated Unlearning
  • Definition 2: Client-level Exact Federated Unlearning
  • Definition 3: $\rho_S$-sample-level TV-stability
  • Definition 4: $\rho_C$-client-level TV-stability
  • Remark 1
  • Lemma 1
  • Theorem 1
  • Definition 5: Gradient Dissimilarity
  • Lemma 2
  • Theorem 2
  • ...and 5 more