Federated Unlearning Made Practical: Seamless Integration via Negated Pseudo-Gradients

Alessio Mora, Carlo Mazzocca, Rebecca Montanari, Paolo Bellavista

TL;DR

This work tackles federated unlearning by introducing PUF, a method that treats client updates as pseudo-gradients and negates them to erase a target client’s influence without storing client update histories or altering the standard FedAvg workflow. PUF operates in two modes, PUF-Regular and PUF-Special: in the first, target clients submit their updates alongside the other clients in a regular FL round; in the second, only the target clients participate. Both modes support concurrent unlearning requests and are task-agnostic. Experiments on CIFAR-10, CIFAR-100, and ProstateMRI show that PUF achieves superior forgetting effectiveness with substantially lower communication, computation, and storage costs than state-of-the-art baselines. The approach preserves model utility after recovery and requires only minimal hyperparameter tuning, making it practical for real-world FL deployments that must honor the right to be forgotten.

Abstract

The right to be forgotten is a fundamental principle of privacy-preserving regulations and extends to Machine Learning (ML) paradigms such as Federated Learning (FL). While FL enhances privacy by enabling collaborative model training without sharing private data, trained models still retain the influence of training data. Recently proposed Federated Unlearning (FU) methods often rely on assumptions that are impractical for real-world FL deployments, such as storing client update histories or requiring access to a publicly available dataset. To address these constraints, this paper introduces a novel method that leverages negated Pseudo-gradients Updates for Federated Unlearning (PUF). Our approach uses only the standard client model updates employed during regular FL rounds and interprets them as pseudo-gradients. When a client needs to be forgotten, we apply the negation of its pseudo-gradients, appropriately scaled, to the global model. Unlike state-of-the-art mechanisms, PUF seamlessly integrates with FL workflows, incurs no computational or communication overhead beyond standard FL rounds, and supports concurrent unlearning requests. We extensively evaluated the proposed method on two well-known benchmark image classification datasets (CIFAR-10 and CIFAR-100) and a real-world medical imaging dataset for segmentation (ProstateMRI), using three different neural architectures: two residual networks and a vision transformer. The experimental results across various settings demonstrate that PUF achieves state-of-the-art forgetting effectiveness and recovery time, without relying on any additional assumptions.
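
The core idea in the abstract, reading a client update as a pseudo-gradient and applying its scaled negation to the global model, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the flat parameter vectors, the sign convention (pseudo-gradient taken as global weights minus client weights), and the use of a plain mean over target clients are all assumptions made for illustration, and the paper's exact scaling rule may differ. The name `eta_u` mirrors the hyper-parameter $\eta_u$ mentioned in the Figure 5 caption.

```python
import numpy as np

def pseudo_gradient(global_w, client_w):
    """Assumed convention: pseudo-gradient = global weights - locally trained weights."""
    return global_w - client_w

def fedavg_step(global_w, client_ws, server_lr=1.0):
    """Regular round: descend along the averaged pseudo-gradient.

    With server_lr=1.0 this reduces to averaging the client weights.
    """
    deltas = [pseudo_gradient(global_w, w) for w in client_ws]
    return global_w - server_lr * np.mean(deltas, axis=0)

def unlearning_step(global_w, target_ws, eta_u=1.0):
    """Unlearning round (hypothetical sketch): apply the NEGATED pseudo-gradient
    of the target client(s), scaled by eta_u, so the global model moves away
    from what those clients' local training would favor.
    """
    deltas = [pseudo_gradient(global_w, w) for w in target_ws]
    return global_w + eta_u * np.mean(deltas, axis=0)

# Toy example with 2-parameter "models".
global_w = np.zeros(2)
clients = [np.ones(2), 3.0 * np.ones(2)]

after_regular = fedavg_step(global_w, clients)                    # moves toward the clients
after_unlearn = unlearning_step(global_w, [clients[0]], eta_u=0.5)  # moves away from client 0
```

In this toy setup the regular step lands on the mean of the client weights, while the unlearning step pushes the global weights in the opposite direction from the target client's local solution. No stored update history is needed, since only the update computed in the current round is used.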

Paper Structure

This paper contains 22 sections, 11 equations, 6 figures, 8 tables, 1 algorithm.

Figures (6)

  • Figure 1: Performance comparison of PUF against state-of-the-art mechanisms using ResNet-18 on CIFAR-100 (non-IID case) in a 10-client setup, where one client requests unlearning. The ideal FU algorithm should (1) minimize the difference from the gold-standard retrained model across forgetting metrics such as Forget Accuracy and various membership inference attacks (MIAs), expressed in the figure as $\Delta$ Forget Accuracy and $\Delta$ MIAs, and (2) minimize the computational and communication cost to recover model utility. A smaller polygon represents better unlearning performance. Experimental details are available in Section \ref{sec:experimental_results}, with full results across settings reported in Table \ref{table:full_results}.
  • Figure 2: Overview of PUF operating modes. In PUF-Regular Round, target clients provide their model updates together with other clients. In PUF-Special Round, only target clients share their model updates.
  • Figure 3: Label distribution across clients (0-9) for CIFAR-100 (IID), CIFAR-10 (Non-IID, $\alpha=0.3$), and CIFAR-100 (Non-IID, $\alpha=0.1$).
  • Figure 4: Evolution of generalization ability (test accuracy) and forgetting effectiveness (forget accuracy) during the recovery phase across three different settings. Each pair of images presents Test Accuracy (left) and Forget Accuracy (right) for a representative client in a specific setting, indicated in the subcaption as a triple (dataset, data distribution, model architecture). For our method, the charts report only the performance of PUF-Special for better visualization. For MoDe, the multi-round unlearning phase is not shown, for clarity.
  • Figure 5: Performance of PUF with varying hyper-parameters ($\eta_u$, $E_u$). When $E_u$ is omitted, it is set to 1. X-axis: number of recovery rounds required to match the retrained model's test accuracy. Y-axis: gap in forget accuracy compared to the retrained model. Points closer to the origin indicate better performance. The experimental setting is indicated in the subcaption as a triple (dataset, data distribution, model architecture). The Natural baseline reports results for a naive strategy that fine-tunes the global model without the participation of the unlearning client; no explicit unlearning is performed.
  • ...and 1 more figures
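
The two axes described in the Figure 5 caption can be made concrete with a short sketch. The helper names, the per-round accuracy list, and the choice of an absolute difference for the gap are assumptions for illustration; the paper may define these quantities differently.

```python
from typing import Optional, Sequence

def recovery_rounds(test_acc_per_round: Sequence[float],
                    retrained_test_acc: float) -> Optional[int]:
    """Return the first (1-based) recovery round whose test accuracy reaches
    the retrained model's test accuracy, or None if it is never reached."""
    for round_idx, acc in enumerate(test_acc_per_round, start=1):
        if acc >= retrained_test_acc:
            return round_idx
    return None

def forget_accuracy_gap(forget_acc: float, retrained_forget_acc: float) -> float:
    """Gap in forget accuracy relative to the retrained model.
    Assumption: the gap is measured as an absolute difference."""
    return abs(forget_acc - retrained_forget_acc)

# Hypothetical numbers, not results from the paper.
rounds_needed = recovery_rounds([0.50, 0.60, 0.70], retrained_test_acc=0.65)
gap = forget_accuracy_gap(0.30, 0.25)
```

Under this reading, a method is better when both values are small: it recovers utility in few rounds and its accuracy on the forgotten client's data stays close to that of a model retrained without that client.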