NoT: Federated Unlearning via Weight Negation
Yasser H. Khalil, Leo Brunswic, Soufiane Lamghari, Xu Li, Mahdi Beitollahi, Xi Chen
TL;DR
NoT is a federated unlearning (FU) algorithm based on weight negation, a storage-free perturbation that disrupts inter-layer co-adaptation while leaving the model positioned for rapid recovery through fine-tuning on retained data. The authors provide a theoretical framework linking strong perturbations, Jacobian control, and layer-wise optimality to a practical two-step FU process: negate selected layer weights to create a perturbed model, then fine-tune on retained data so that the target data is forgotten without ever being accessed. Empirically, NoT outperforms seven FU baselines on CIFAR-10, CIFAR-100, and Caltech-101 with CNN, ResNet-18, and ViT architectures, including in backdoor mitigation and centralized settings, while incurring minimal communication and computation costs. The claims are supported by ablations, CKA analyses, and theoretical results on unlearning time, activation distance, and Jacobian behavior. Overall, NoT offers a principled, scalable approach to FU that does not require access to the data being forgotten and reduces the overhead of data forgetting in federated environments.
Abstract
Federated unlearning (FU) aims to remove a participant's data contributions from a trained federated learning (FL) model, ensuring privacy and regulatory compliance. Traditional FU methods often depend on auxiliary storage on either the client or server side, or require direct access to the data targeted for removal, a dependency that may not be feasible if the data is no longer available. To overcome these limitations, we propose NoT, a novel and efficient FU algorithm based on weight negation (multiplying by -1), which circumvents the need for additional storage and access to the target data. We argue that effective and efficient unlearning can be achieved by perturbing model parameters away from the set of optimal parameters while keeping them well positioned for quick re-optimization. This technique, though seemingly contradictory, is theoretically grounded: we prove that the weight negation perturbation effectively disrupts inter-layer co-adaptation, inducing unlearning while preserving an approximate optimality property, thereby enabling rapid recovery. Experimental results across three datasets and three model architectures demonstrate that NoT significantly outperforms existing baselines in unlearning efficacy as well as in communication and computational efficiency.
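The two-step process described above (negate selected layer weights, then fine-tune on retained data) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the toy two-layer network, the choice of `layer1` as the negated layer, and the names `negate_layers`, `forward`, `unlearn`, and the `fine_tune` hook are all hypothetical and chosen only for illustration.

```python
import copy
import math


def negate_layers(params, layer_names):
    """Return a copy of `params` with the named layers' weights multiplied by -1."""
    perturbed = copy.deepcopy(params)  # leave the original model untouched
    for name in layer_names:
        perturbed[name] = [[-w for w in row] for row in perturbed[name]]
    return perturbed


def forward(params, x):
    """Toy two-layer MLP (tanh hidden layer, linear output, no biases)."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)))
        for row in params["layer1"]
    ]
    return [sum(w * h for w, h in zip(row, hidden)) for row in params["layer2"]]


def unlearn(params, layer_names, fine_tune, retained_data):
    """Step 1: negate the selected layers. Step 2: fine-tune on retained data.

    `fine_tune` is a caller-supplied hook (an assumption of this sketch); in a
    federated setting it would stand for training rounds run by the remaining
    clients. The data to be forgotten is never passed in.
    """
    perturbed = negate_layers(params, layer_names)
    return fine_tune(perturbed, retained_data)


# Hypothetical toy weights; which layers to negate is an assumption here.
params = {"layer1": [[0.5, -0.2], [0.1, 0.3]], "layer2": [[1.0, -1.0]]}
x = [1.0, 2.0]

before = forward(params, x)
after = forward(negate_layers(params, ["layer1"]), x)

# Placeholder fine-tuning step that returns the perturbed model unchanged.
unlearned = unlearn(params, ["layer1"], lambda p, data: p, retained_data=[])
```

In this bias-free toy network the negated first layer flips the sign of every hidden activation, so the later layer receives inputs it was not co-adapted to and the output changes, while the weights of the untouched layer are preserved. Nothing beyond the current weights is stored, and negating the same layer twice restores the original model.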
