SIFU: Sequential Informed Federated Unlearning for Efficient and Provable Client Unlearning in Federated Optimization

Yann Fraboni, Martin Van Waerebeke, Kevin Scaman, Richard Vidal, Laetitia Kameni, Marco Lorenzi

TL;DR

The paper tackles the critical problem of unlearning in federated learning by providing formal, scalable guarantees for removing a client's contribution from models trained with FedAvg. It introduces SIFU, a Sequential Informed Federated Unlearning framework that builds on Informed FU (IFU) by handling sequences of unlearning requests while preserving convergence. The approach derives a computable sensitivity bound to certify unlearning via client-specific Gaussian perturbations and retraining, applicable to both convex and non-convex FL settings. Empirical results across multiple datasets show that SIFU achieves superior forgetting performance with favorable utility on retained data, outperforming several baselines and enabling watermark-based verification of unlearning effectiveness.
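The certification step mentioned above (Gaussian perturbation calibrated to a sensitivity bound) can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the classical Gaussian-mechanism calibration $\sigma = \Psi \sqrt{2 \ln(1.25/\delta)} / \epsilon$, and the function names `gaussian_sigma` and `perturb` are hypothetical.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise standard deviation for an (epsilon, delta) budget.

    Assumption: classical Gaussian-mechanism calibration; the paper's
    exact constant may differ.
    """
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def perturb(theta, sensitivity, epsilon, delta, seed=0):
    """Add i.i.d. Gaussian noise to every coordinate of the model vector."""
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    rng = random.Random(seed)
    return [w + rng.gauss(0.0, sigma) for w in theta]


# Hypothetical values: a 3-parameter model and a sensitivity bound of 1.0,
# with the budget (epsilon, delta) = (10, 0.01) used in Figure 5.
theta = [0.2, -1.3, 0.7]
sigma = gaussian_sigma(1.0, 10.0, 0.01)
theta_noisy = perturb(theta, 1.0, 10.0, 0.01)
```

The noise scale grows linearly with the sensitivity bound, so a tighter bound for a given client means less perturbation and less retraining to recover utility.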

Abstract

Machine Unlearning (MU) is an increasingly important topic in machine learning safety, aiming at removing the contribution of a given data point from a training procedure. Federated Unlearning (FU) consists in extending MU to unlearn a given client's contribution from a federated training routine. While several FU methods have been proposed, we currently lack a general approach providing formal unlearning guarantees to the FedAvg routine, while ensuring scalability and generalization beyond the convex assumption on the clients' loss functions. We aim at filling this gap by proposing SIFU (Sequential Informed Federated Unlearning), a new FU method applying to both convex and non-convex optimization regimes. SIFU naturally applies to FedAvg without additional computational cost for the clients and provides formal guarantees on the quality of the unlearning task. We provide a theoretical analysis of the unlearning properties of SIFU, and practically demonstrate its effectiveness as compared to a panel of unlearning methods from the state-of-the-art.

Paper Structure (31 sections, 3 theorems, 40 equations, 7 figures, 2 tables, 3 algorithms)

Key Result

Theorem 1

For smooth clients' local loss functions (i.e., with Lipschitz-continuous gradients), we have with the bounded sensitivity $\Psi$ defined as: where $\eta$ is the learning rate, $\gamma_{s,n} = (n-s-1)K$, and $B(f_I, \eta )<1$, $B(f_I, \eta )=1$ or $B(f_I, \eta )>1$ if the clients' loss functions are smooth and, respectively, strongly convex, convex, or non-convex. The exact formula for $B(f_I, \e
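To make the three regimes of $B(f_I, \eta)$ concrete, the sketch below tabulates the exponent $\gamma_{s,n} = (n-s-1)K$ and a factor $B^{\gamma_{s,n}}$ for each round $s$. Two points are assumptions rather than statements from the text above: that $B$ enters the bound raised to the power $\gamma_{s,n}$, and that $s$ ranges over $0, \ldots, n-1$. The helper names `gamma` and `round_weights` are hypothetical.

```python
def gamma(s, n, K):
    """Exponent gamma_{s,n} = (n - s - 1) * K, as defined in Theorem 1."""
    return (n - s - 1) * K


def round_weights(B, n, K):
    """Factor B ** gamma_{s,n} for s = 0, ..., n-1.

    Assumption: B is raised to the power gamma_{s,n} inside the bound.
    """
    return [B ** gamma(s, n, K) for s in range(n)]


# Hypothetical setting: n = 3 rounds, K = 1 local step per round.
strongly_convex = round_weights(0.5, 3, 1)  # B < 1: early rounds are damped
convex = round_weights(1.0, 3, 1)           # B = 1: all rounds weigh equally
non_convex = round_weights(2.0, 3, 1)       # B > 1: early rounds are amplified
```

Under these assumptions, the strongly convex case forgets early contributions geometrically, while in the non-convex case the weight of early rounds grows with both the number of rounds and the number of local steps $K$.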

Figures (7)

  • Figure 1: Illustration of SIFU (see the SIFU algorithm in the paper) when the server receives $U=3$ unlearning requests, through the evolution of the global model parameters ${\bm{\theta}}_u^n$ after server aggregation and noise perturbation. After standard federated training via FedAvg$(I, N_0)$ the training history is $H(0) = ({\bm{\theta}}_0^0, \ldots, {\bm{\theta}}_0^{N_0})$. At request $u=1$, the unlearning index is $T_1$, and the training history becomes $H(1) = ({\bm{\theta}}_0^0, \ldots, {\bm{\theta}}_0^{T_1}, {\bm{\theta}}_1^0, \ldots, {\bm{\theta}}_1^{N_1})$ with $\zeta_1 = 0$. At request $u=2$, the unlearning index is $T_2$ and the training history becomes $H(2) = ({\bm{\theta}}_0^0, \ldots, {\bm{\theta}}_0^{T_1}, {\bm{\theta}}_1^0, \ldots, {\bm{\theta}}_1^{T_2}, {\bm{\theta}}_2^{0}, \ldots, {\bm{\theta}}_2^{N_2})$ with $\zeta_2 = 1$. Finally, at request $u=3$, the unlearning index is found at $T_3<T_2$ in the branch of request $u=1$. The updated training history is now $H(3) = ({\bm{\theta}}_0^0, \ldots, {\bm{\theta}}_0^{T_1}, {\bm{\theta}}_1^0, \ldots, {\bm{\theta}}_1^{T_3}, {\bm{\theta}}_3^{0}, \ldots, {\bm{\theta}}_3^{N_3})$ with $\zeta_3 = 1$.
  • Figure 2: Difference in accuracy (absolute value) between Scratch and the considered unlearning methods, on both retain and forget sets (lower is better).
  • Figure 3: Total number of aggregation rounds (1st row) and model accuracy of unlearned clients (2nd row) for the unlearning of watermarked data from MNIST, FashionMNIST, CIFAR10, CIFAR100, and CelebA (the lower the better).
  • Figure 4: Total number of aggregation rounds (1st row) and model accuracy of unlearned clients (2nd row) for MNIST, FashionMNIST, CIFAR10, CIFAR100, and CelebA (the lower the better). The server runs a federated routine with $M=100$ clients, and unlearns 10 of them at each unlearning request ($U=3$). Results are reported with variability estimated on 10 seeds.
  • Figure 5: Impact of the noise standard deviation $\sigma$ when unlearning with SIFU for the unlearning budget $(\epsilon, \delta) = (10, 0.01)$. Total number of aggregation rounds (1st row) and model accuracy of unlearned clients (2nd row) for MNIST, FashionMNIST, CIFAR10, CIFAR100, and CelebA (the lower the better). Speed-ups at optimal sigma are between two-fold and five-fold.
  • ...and 2 more figures
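
The history bookkeeping described in the Figure 1 caption can be sketched as a truncate-and-branch operation. This is an illustrative reading of the caption, not the paper's algorithm: each model ${\bm{\theta}}_u^n$ is represented only by its label `(u, n)`, the helper names are hypothetical, and the concrete values of $N_u$ and $T_u$ are made up for the example.

```python
def initial_history(n_rounds):
    """H(0): labels (0, 0), ..., (0, N_0) of the initial FedAvg run."""
    return [(0, n) for n in range(n_rounds + 1)]


def branch_history(history, zeta, t, u, n_rounds):
    """Cut the history after model (zeta, t) and append branch u.

    zeta is the branch in which the unlearning index t was found;
    everything after (zeta, t) is discarded before retraining.
    """
    cut = history.index((zeta, t))  # ValueError if (zeta, t) is not in history
    kept = history[: cut + 1]
    new_branch = [(u, n) for n in range(n_rounds + 1)]
    return kept + new_branch


# Hypothetical sizes: N_0=5, T_1=3, N_1=4, T_2=2, N_2=3, T_3=1, N_3=2.
h0 = initial_history(5)
h1 = branch_history(h0, zeta=0, t=3, u=1, n_rounds=4)  # zeta_1 = 0
h2 = branch_history(h1, zeta=1, t=2, u=2, n_rounds=3)  # zeta_2 = 1
h3 = branch_history(h2, zeta=1, t=1, u=3, n_rounds=2)  # zeta_3 = 1, T_3 < T_2
```

Because $T_3 < T_2$ falls inside the branch of request $u=1$, the third call discards the whole branch of request $u=2$, which matches the caption's $H(3)$.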

Theorems & Definitions (9)

  • Definition 1
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Theorem 3
  • proof
  • proof
  • proof