Federated Unlearning with Gradient Descent and Conflict Mitigation

Zibin Pan; Zhichao Wang; Chi Li; Kaiyan Zheng; Boqi Wang; Xiaoying Tang; Junhua Zhao

TL;DR

The paper tackles the privacy challenge of the Right to be Forgotten in Federated Learning by addressing weaknesses of gradient-ascent unlearning, such as gradient explosion, utility loss, and post-unlearning reversion. It introduces FedOSD, which uses Unlearning Cross-Entropy to enable stable gradient descent, computes an orthogonal steepest descent direction to avoid gradient conflicts, and applies gradient projection during post-training to prevent reverting. Empirical results across diverse datasets and partitions show that FedOSD achieves zero unlearning error while preserving or improving retained-client accuracy and avoiding reversion, outperforming state-of-the-art FU methods. The approach offers a principled, efficient pathway to implement federated unlearning with reliable utility preservation and robust privacy guarantees in practical FL deployments.
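The post-training projection mentioned above can be pictured with a small sketch. This is an illustration only, not the authors' implementation: it assumes the "reverting" direction is the straight line from the current model back to the original model $\omega^0$, and the function name `project_away_from_original` and the toy vectors are hypothetical.

```python
import numpy as np

def project_away_from_original(update, w_current, w_original):
    """Remove the part of `update` that points back toward w_original.

    Assumption (not from the paper excerpt): reverting is measured along
    the straight line from the current model to the original model.
    """
    back = w_original - w_current          # direction that would undo unlearning
    norm_sq = float(back @ back)
    if norm_sq == 0.0:
        return update                      # already at the original model
    overlap = float(update @ back)
    if overlap <= 0.0:
        return update                      # update does not move toward w_original
    return update - (overlap / norm_sq) * back

# Toy example: the raw step would shrink the distance to w0.
w0 = np.array([0.0, 0.0])                  # original model
w = np.array([2.0, 0.0])                   # model after unlearning
step = np.array([-1.0, 1.0])               # raw post-training update
safe = project_away_from_original(step, w, w0)

dist_before = np.linalg.norm(w - w0)
dist_raw = np.linalg.norm(w + step - w0)
dist_safe = np.linalg.norm(w + safe - w0)
```

In this toy case the raw step reduces the distance to the original model, while the projected step does not.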

Abstract

Federated Learning (FL) has received much attention in recent years. However, although clients are not required to share their data in FL, the global model itself can implicitly remember clients' local data. It is therefore necessary to effectively remove the target client's data from the FL global model to reduce the risk of privacy leakage and implement "the right to be forgotten". Federated Unlearning (FU) has been considered a promising way to remove data without full retraining. However, model utility easily suffers a significant reduction during unlearning due to gradient conflicts. Furthermore, when post-training is conducted to recover model utility, the model is prone to move back and revert what has already been unlearned. To address these issues, we propose Federated Unlearning with Orthogonal Steepest Descent (FedOSD). We first design an Unlearning Cross-Entropy loss to overcome the convergence issue of gradient ascent. A steepest descent direction for unlearning is then calculated under the condition that it does not conflict with other clients' gradients and is closest to the target client's gradient. This allows the model to unlearn efficiently while mitigating the reduction in model utility. After unlearning, we recover the model utility while maintaining what has been unlearned. Finally, extensive experiments in several FL scenarios verify that FedOSD outperforms the SOTA FU algorithms in terms of unlearning and model utility.
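The idea of a direction that is non-conflicting with other clients' gradients yet close to the target client's gradient can be sketched as an orthogonal projection. The snippet below is a simplified illustration under stated assumptions, not the procedure from the paper: it treats "closest" as Euclidean distance and enforces exact orthogonality to every remaining client's gradient. The name `orthogonal_direction` and the example gradients are hypothetical.

```python
import numpy as np

def orthogonal_direction(g_target, g_others):
    """Project g_target onto the subspace orthogonal to all rows of g_others.

    The result is the vector closest to g_target (in Euclidean norm) whose
    dot product with every other client's gradient is zero.
    """
    G = np.atleast_2d(np.asarray(g_others, dtype=float))   # shape (k, dim)
    g = np.asarray(g_target, dtype=float)                  # shape (dim,)
    # Least-squares coefficients of the component of g lying in span(rows of G).
    coeffs, *_ = np.linalg.lstsq(G.T, g, rcond=None)
    return g - G.T @ coeffs

# Three clients; client 3 is the one to be unlearned.
g1 = np.array([1.0, 0.0, 0.0])
g2 = np.array([0.0, 1.0, 0.0])
g3 = np.array([-1.0, -1.0, 1.0])   # conflicts with g1 and g2 (negative dot products)

d = orthogonal_direction(g3, [g1, g2])
```

Using `lstsq` rather than an explicit inverse keeps the sketch well behaved when the remaining clients' gradients are linearly dependent.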

Paper Structure (28 sections, 65 equations, 9 figures, 12 tables, 1 algorithm)

Introduction
Background & Related Work
Federated Learning (FL)
Federated Unlearning
The Proposed Approach
Unlearning Cross-Entropy Loss
Orthogonal Steepest Descent Direction
Gradient Projection in Post-training
Experiments
Experimental Setup
Evaluation of Unlearning and Model Utility
Ablation Experiments
Conclusion and Future Work
Theoretical Analysis and Proof
Theoretical Analysis of FedOSD
...and 13 more sections

Figures (9)

Figure 1: A demo of three clients. $g_1, g_2, g_3$ represent the gradient of clients. $d$ denotes the update direction for unlearning client $3$, which is conflicting with $g_1$ and $g_2$, i.e., $g_1\cdot d < 0$ and $g_2\cdot d < 0$. $d_{FedOSD}$ represents the direction obtained by FedOSD, which doesn't conflict with $g_1$ and $g_2$.
Figure 2: The FedOSD framework comprises two main stages: (b) the unlearning stage and (c) the post-training stage. Subfigure (a) depicts the FL training procedure before the client requests unlearning, where the obtained model is denoted as $\omega^0$ and serves as the original model for unlearning.
Figure 3: A comparison between (a) Cross-Entropy (CE) and (b) the proposed Unlearning Cross-Entropy (UCE). Unlearning with CE loss and gradient ascent (GA) requires driving $p_{o,c}$ to 0, which leads to gradient explosion and non-convergence. When the target client switches to UCE, it uses gradient descent to drive $p_{o,c}$ to 0 and does not run into the convergence issue.
Figure 4: A demo depicting the model reverting issue in post-training. The contour map denotes the local loss of the model on a remaining client. $\omega^0$ is the original model before unlearning. $\omega^{T_u}$ is the model after unlearning. The dashed arrow depicts the path of the model update in post-training, where $\omega^{T_u}$ moves to $\bar{\omega}^{T_u+1}$ and is closer to $\omega^0$. The red arrows indicate a better path obtained by FedOSD.
Figure 5: The ASR, the mean R-Acc, and the distance away from $\omega^0$ during unlearning and post-training stages in the Pat-50 scenario on CIFAR-10.
...and 4 more figures
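The contrast drawn in the Figure 3 caption can be checked numerically for a single true-class probability $p$. The exact form of the Unlearning Cross-Entropy is not given in this excerpt, so the snippet assumes one loss with the described property, $-\log(1 - p/2)$, purely for illustration; the function names are hypothetical.

```python
import math

def ce(p):
    """Standard cross-entropy for the true class with probability p."""
    return -math.log(p)

def d_ce(p):
    """Derivative of CE with respect to p; unbounded as p -> 0."""
    return -1.0 / p

def uce(p):
    """Assumed UCE-style loss (illustrative form, not confirmed by this excerpt)."""
    return -math.log(1.0 - p / 2.0)

def d_uce(p):
    """Derivative of the assumed loss with respect to p; stays within [0.5, 1]."""
    return 0.5 / (1.0 - p / 2.0)

# Gradient ascent on CE and gradient descent on the assumed loss both push p down,
# but only the CE derivative blows up along the way.
rows = [(p, ce(p), d_ce(p), uce(p), d_uce(p)) for p in (0.5, 1e-2, 1e-6)]
for p, l_ce, g_ce, l_uce, g_uce in rows:
    print(f"p={p:g}  CE={l_ce:.4g}  dCE/dp={g_ce:.4g}  UCE={l_uce:.4g}  dUCE/dp={g_uce:.4g}")
```

As $p$ approaches 0, the CE loss and the magnitude of its derivative grow without bound, whereas the assumed loss approaches its minimum of 0 with a bounded derivative.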
