Scalable Federated Unlearning via Isolated and Coded Sharding

Yijing Lin, Zhipeng Gao, Hongyang Du, Dusit Niyato, Gui Gui, Shuguang Cui, Jinke Ren

TL;DR

This work tackles the storage and computation bottlenecks of federated unlearning by introducing a two-tier framework: stage-based isolated sharding to limit the number of affected clients, and coded sharding to compress and distribute intermediate model parameters. The authors provide theoretical time-efficiency bounds, with $T_s = K \overline{C}_t$ for sequential unlearning and $T_c = S \overline{C}_t \left(1 - \left(1 - \dfrac{1}{S}\right)^K\right)$ for concurrent unlearning, and storage/throughput guarantees for coded sharding, including $\gamma_f=1$, $\gamma_s=S$, and $\gamma_c \le (1-2\mu)C$ with $\lambda_c = \dfrac{S}{O(C^2 \log^2 C \log \log C)}$. Empirical results on classification and generation tasks demonstrate substantial improvements: retraining time reductions of roughly 65–70% and storage overhead reductions up to 98% compared with state-of-the-art baselines, while maintaining comparable unlearning effectiveness. The approach advances the practicality of federated unlearning in privacy-regulated settings by reducing resource demands and enabling scalable, provably efficient data-forgetting operations.
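To make the two time bounds concrete, the sketch below evaluates them numerically. It assumes a reading the summary does not spell out: $K$ is the number of unlearning requests, $S$ is the number of shards, and $\overline{C}_t$ is the average retraining cost of one shard. Under that reading, $S\left(1 - (1 - 1/S)^K\right)$ is the expected number of distinct shards hit when $K$ requests land on shards uniformly at random, which the small simulation checks. The function names are hypothetical and not taken from the paper.

```python
import random


def sequential_time(k: int, c_bar: float) -> float:
    """T_s = K * C_t: each of the K requests triggers one retraining pass."""
    return k * c_bar


def concurrent_time(k: int, s: int, c_bar: float) -> float:
    """T_c = S * C_t * (1 - (1 - 1/S)^K)."""
    return s * c_bar * (1.0 - (1.0 - 1.0 / s) ** k)


def simulated_distinct_shards(k: int, s: int, trials: int = 20000, seed: int = 0) -> float:
    """Average number of distinct shards hit by K uniformly random requests.

    Assumption: requests are spread uniformly over shards; the paper's
    request model may differ.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += len({rng.randrange(s) for _ in range(k)})
    return total / trials


if __name__ == "__main__":
    K, S, C_BAR = 8, 4, 1.0  # illustrative values only
    print("sequential:", sequential_time(K, C_BAR))
    print("concurrent:", concurrent_time(K, S, C_BAR))
    print("simulated distinct shards:", simulated_distinct_shards(K, S))
```

With a single request the two bounds coincide, and the concurrent bound never exceeds $S \overline{C}_t$ no matter how many requests arrive, because at most $S$ shards can be affected.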

Abstract

Federated unlearning has emerged as a promising paradigm for erasing the effect of a client's data without degrading the performance of collaborative learning models. However, the federated unlearning process often introduces extensive storage overhead and consumes substantial computational resources, which hinders its implementation in practice. To address this issue, this paper proposes a scalable federated unlearning framework based on isolated sharding and coded computing. We first divide distributed clients into multiple isolated shards across stages to reduce the number of clients affected by an unlearning request. Then, to reduce the storage overhead of the central server, we develop a coded computing mechanism that compresses the model parameters across different shards. In addition, we provide a theoretical analysis of the time efficiency and storage effectiveness of the isolated and coded sharding. Finally, extensive experiments on two typical learning tasks, i.e., classification and generation, demonstrate that our proposed framework achieves better performance than three state-of-the-art frameworks in terms of accuracy, retraining time, storage overhead, and F1 scores for resisting membership inference attacks.
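The isolation idea in the abstract can be illustrated with a minimal sketch: clients are partitioned into shards, and an unlearning request only touches the shard that contains the requesting client. The round-robin assignment, the function names, and the omission of stages and of the coded-computing step are all simplifying assumptions made here for illustration; they are not the paper's algorithm.

```python
from collections import defaultdict


def assign_shards(client_ids, num_shards):
    """Partition clients into shards round-robin.

    Assumption: the paper's actual assignment rule is not given in this
    summary; round-robin is used only to keep the example deterministic.
    """
    shards = defaultdict(list)
    for index, client_id in enumerate(client_ids):
        shards[index % num_shards].append(client_id)
    return dict(shards)


def affected_shards(shards, unlearn_requests):
    """Return the ids of shards containing at least one requesting client."""
    requested = set(unlearn_requests)
    return {sid for sid, members in shards.items() if requested & set(members)}


def clients_to_recalibrate(shards, unlearn_requests):
    """Remaining clients of the affected shards; all other shards are untouched."""
    requested = set(unlearn_requests)
    hit = affected_shards(shards, unlearn_requests)
    return sorted(
        client_id
        for sid in hit
        for client_id in shards[sid]
        if client_id not in requested
    )


if __name__ == "__main__":
    shards = assign_shards(list(range(12)), num_shards=4)
    print(shards)
    print(affected_shards(shards, [5]))
    print(clients_to_recalibrate(shards, [5]))
```

In this toy setup a request from one client out of twelve involves only the two other clients in its shard, rather than all eleven remaining clients.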

Paper Structure (13 sections, 13 equations, 5 figures, 1 table, 1 algorithm)

Figures (5)

  • Figure 1: Federated Learning vs. Federated Unlearning. (a) In federated learning, the clients and the server collaboratively train a global model by exchanging model parameters. (b) In federated unlearning, upon receiving an unlearning request from a specific client $C^{\prime}$, the well-trained global model will be unlearned by using the local models of other clients for calibration, thus removing the corresponding data effect.
  • Figure 2: Scalable Federated Unlearning Framework. For the unlearning requests initiated at different stages, only affected shards perform calibrations to remove the data effects of specific clients. To reduce storage overhead and improve scalability, intermediate model parameters are encoded/decoded as slices for efficient communication between clients and servers.
  • Figure 3: Performance with a single unlearning request.
  • Figure 4: Performance with concurrent unlearning requests.
  • Figure 5: Communication time and storage overhead of different frameworks with concurrent adaptive unlearning requests.