Federated Unlearning: How to Efficiently Erase a Client in FL?

Anisa Halimi, Swanand Kadhe, Ambrish Rawat, Nathalie Baracaldo

TL;DR

The paper addresses removing a single client’s entire data influence in federated learning by first performing constrained local unlearning at the target client via projected gradient descent around a reference model, then continuing with a few rounds of federated learning starting from this unlearned model. This two-phase approach achieves comparable results to retraining from scratch while significantly reducing communication and computation costs, without requiring global data access or stored update histories. Empirical evaluations on backdoors and flipped-image scenarios across MNIST, EMNIST, and CIFAR-10 demonstrate strong efficacy, fidelity, and substantial efficiency gains, highlighting practical applicability in cross-silo FL. The method’s key novelty lies in formulating local unlearning as a constrained optimization around a reference model and leveraging minimal FL rounds to recover a robust, unlearned global model.

Abstract

With privacy legislation granting users the right to be forgotten, it has become essential to make a model amenable to forgetting some of its training data. However, existing machine unlearning methods cannot be directly applied in distributed settings like federated learning, because of differences in the learning protocol and the presence of multiple actors. In this paper, we tackle the problem of federated unlearning for the case of erasing a client, that is, removing the influence of their entire local data from the trained global model. To erase a client, we propose to first perform local unlearning at the client to be erased, and then use the locally unlearned model as the initialization for a very small number of federated learning rounds between the server and the remaining clients, which yields the unlearned global model. We empirically evaluate our unlearning method using multiple performance measures on three datasets, and demonstrate that it achieves performance comparable to the gold-standard unlearning method of federated retraining from scratch, while being significantly more efficient. Unlike prior works, our unlearning method requires neither global access to the data used for training nor the history of parameter updates to be stored by the server or any of the clients.
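The two phases described above can be illustrated with a minimal sketch. It is not the authors' implementation: the linear model, the squared-error loss, the L2-ball constraint with a hypothetical `radius`, the choice to raise the target client's loss, and all function names are assumptions made for illustration. The reference model `w_ref` is simply passed in, since how it is constructed is not specified here.

```python
import numpy as np

def project_l2_ball(w, w_ref, radius):
    """Project w onto the L2 ball of the given radius centred at w_ref."""
    diff = w - w_ref
    norm = np.linalg.norm(diff)
    if norm <= radius:
        return w
    return w_ref + diff * (radius / norm)

def mse_loss_grad(w, X, y):
    """Mean squared error of a linear model and its gradient w.r.t. w."""
    resid = X @ w - y
    return float(np.mean(resid ** 2)), 2.0 * X.T @ resid / len(y)

def local_unlearn(w_ref, X, y, radius=1.0, lr=0.1, steps=50):
    """Phase 1 (assumed form): raise the loss on the target client's own
    data while keeping the model within `radius` of the reference model."""
    w = w_ref.copy()
    for _ in range(steps):
        _, grad = mse_loss_grad(w, X, y)
        # Descent on the negated loss, i.e. an ascent step on the client's
        # loss, followed by projection back onto the constraint set.
        w = project_l2_ball(w + lr * grad, w_ref, radius)
    return w

def fedavg_round(w_global, clients, lr=0.1, local_steps=5):
    """Phase 2 (assumed form): one equal-weight FedAvg round over the
    remaining clients, each given as an (X, y) pair."""
    local_models = []
    for X, y in clients:
        w = w_global.copy()
        for _ in range(local_steps):
            _, grad = mse_loss_grad(w, X, y)
            w = w - lr * grad
        local_models.append(w)
    return np.mean(local_models, axis=0)
```

In this sketch the target client would call `local_unlearn` once, and the server would then call `fedavg_round` a few times with the returned model as the starting point and only the remaining clients' data. The projection is what keeps the unlearned model from drifting arbitrarily far from the reference model while the client's loss is being increased.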

Paper Structure (16 sections, 3 equations, 13 figures, 1 algorithm)

Figures (13)

  • Figure 1: Phases of Federated Unlearning: (a) First, clients and the server participate in a federated learning process to train a global model. (b) One of the clients wants to opt out of the federation, and wants to unlearn their data. The target client $i$ locally runs Projected Gradient Descent (Algorithm 1) to obtain model $\mathbf{w}^u_i$. (c) The server and the remaining clients perform a few steps of federated learning with $\mathbf{w}^{u}_i$ as the initial point to obtain the final 'unlearned' model (Algorithm 1).
  • Figure 2: Backdoor accuracy (efficacy) of the fully retrained and the PGD-based unlearned model in each dataset, and their comparison with the FedAvg model before unlearning. The backdoor accuracy of the PGD-based unlearned model is obtained after $1$ round of FL post-training. Our method significantly reduces the backdoor accuracy compared to FedAvg model and achieves a similar performance as retraining, which demonstrates its high unlearning efficacy.
  • Figure 3: Backdoor Scenario: Membership inference attack accuracy (efficacy) for the two attacks on the three datasets with $N=5$ clients. Our proposed method achieves a similar attack accuracy to retraining, which demonstrates its high efficacy.
  • Figure 4: Backdoor Scenario: Clean accuracy of the fully retrained and the PGD-based unlearned model in each dataset. The clean accuracy of the PGD-based unlearned model is obtained after $5$ rounds of FL post-training. Our unlearning method achieves similar clean accuracy to retraining, which demonstrates its high fidelity.
  • Figure 5: Backdoor Scenario: Communication costs (efficiency) of the proposed unlearning method and the baseline approach of retraining with respect to the clean accuracy (fidelity) in each dataset for $N=5$ clients.
  • ...and 8 more figures