Federated Unlearning
Gaoyang Liu, Xiaoqiang Ma, Yang Yang, Chen Wang, Jiangchuan Liu
TL;DR
This work tackles the challenge of data forgetting in federated learning by introducing FedEraser, a non-intrusive unlearning method that reconstructs an unlearned global model from the client updates retained at the server during training, corrected by a lightweight calibration step. By trading server storage for faster unlearning, FedEraser achieves speed-ups of around 4x over retraining from scratch while keeping model utility close to that of baseline FL and of retraining. The approach is validated on four realistic datasets, and privacy assessments using membership inference attacks show leakage comparable to retraining, indicating effective forgetting with limited additional risk. Overall, FedEraser represents an early, practical step toward compliant and transparent federated learning by enabling efficient, verifiable data removal without requiring access to client data.
Abstract
Federated learning (FL) has recently emerged as a promising distributed machine learning (ML) paradigm. The "right to be forgotten" and the need to counter data poisoning attacks both call for efficient techniques that can remove, or unlearn, specific training data from a trained FL model. Existing ML unlearning techniques, however, do not carry over to FL, mainly because FL and conventional ML learn from data in inherently different ways. How to enable efficient data removal from FL models therefore remains largely under-explored. In this paper, we take the first step to fill this gap by presenting FedEraser, the first federated unlearning methodology that can eliminate the influence of a federated client's data on the global FL model while significantly reducing the time needed to construct the unlearned FL model. The basic idea of FedEraser is to trade the central server's storage for the unlearned model's construction time: FedEraser reconstructs the unlearned model from the historical parameter updates of federated clients that the central server retained during FL training. A novel calibration method is further developed to calibrate the retained updates, which are then used to promptly construct the unlearned model, significantly speeding up reconstruction while maintaining model efficacy. Experiments on four realistic datasets demonstrate the effectiveness of FedEraser, with an expected speed-up of $4\times$ compared with retraining from scratch. We envision our work as an early step in FL towards compliance with legal and ethical criteria in a fair and transparent manner.
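To make the storage-for-time trade concrete, the following is a minimal Python sketch of the reconstruction loop described above. It is an illustration under stated assumptions, not the implementation evaluated in the paper. The abstract does not specify the calibration rule, so the sketch assumes one plausible choice: keep the magnitude of each retained update and take its direction from a short, fresh update computed by the remaining client. The names `calibrate`, `reconstruct`, and `calibration_fn` are hypothetical, and model parameters are represented as flat lists of floats.

```python
import math


def norm(v):
    """Euclidean norm of a flat parameter vector."""
    return math.sqrt(sum(x * x for x in v))


def calibrate(retained, fresh):
    # ASSUMED rule (not stated in the abstract): keep the magnitude of the
    # retained update, take the direction from the fresh calibration update.
    fresh_norm = norm(fresh)
    if fresh_norm == 0.0:
        return [0.0] * len(fresh)
    scale = norm(retained) / fresh_norm
    return [scale * x for x in fresh]


def average(updates):
    """Coordinate-wise mean of a non-empty list of update vectors."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


def reconstruct(initial_model, retained_history, calibration_fn, forgotten):
    """Rebuild a global model without the forgotten client's contribution.

    retained_history: one dict per round, mapping client id -> retained update.
    calibration_fn(model, client_id): hypothetical hook that returns a short,
        fresh update computed by a remaining client on the current model.
    forgotten: id of the client whose data must be unlearned.
    """
    model = list(initial_model)
    for round_updates in retained_history:
        calibrated = []
        for client_id, retained in round_updates.items():
            if client_id == forgotten:
                continue  # the forgotten client's updates are never replayed
            fresh = calibration_fn(model, client_id)
            calibrated.append(calibrate(retained, fresh))
        if calibrated:
            step = average(calibrated)
            model = [w + s for w, s in zip(model, step)]
    return model


if __name__ == "__main__":
    # Toy data: one round, three clients, client "c" asks to be forgotten.
    history = [{"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [100.0, 100.0]}]
    fresh_updates = {"a": [0.5, 0.0], "b": [0.0, 0.5]}
    unlearned = reconstruct(
        [0.0, 0.0], history, lambda model, cid: fresh_updates[cid], "c"
    )
    print(unlearned)
```

In this sketch the speed-up would come from `calibration_fn` running far fewer local training steps than the original rounds did, since the retained updates already supply the step size. How much calibration training is needed in practice is a tuning question the abstract leaves open.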
