FedSGT: Exact Federated Unlearning via Sequential Group-based Training
Bokang Zhang, Hong Guan, Hong kyu Lee, Ruixuan Liu, Jia Zou, Li Xiong
TL;DR
This work addresses the challenge of exact unlearning in federated learning by introducing FedSGT, which partitions client data into uniform groups and trains multiple sequences of lightweight PEFT modules, one sequence per group permutation. Exact unlearning is achieved by deactivating the modules linked to the deleted data, with no immediate retraining, while multiple training sequences preserve model utility under accumulating deletions. The authors provide theoretical analyses of deletion rate, performance guarantees, and overhead, and validate FedSGT empirically across vision and language tasks, showing that it sustains service longer with accuracy and efficiency comparable to other exact unlearning baselines. Ablation studies confirm robustness to data heterogeneity and parameter choices, demonstrating FedSGT’s practical viability for compliant, privacy-preserving FL with repeated unlearning requests.
Abstract
Federated Learning (FL) enables collaborative, privacy-preserving model training, but supporting the "Right to be Forgotten" is especially challenging because data influences the model through distributed and interleaved client updates. Existing exact unlearning methods typically require frequent retraining from scratch, resulting in high communication cost and long service downtime. To address this, we propose Federated Sequential Group-based Training (FedSGT), an exact unlearning framework for FL. FedSGT partitions the data into uniform groups; each client may participate in multiple groups and, to control communication overhead, can limit the number of groups it contributes to. FedSGT then trains multiple sequences of Parameter-Efficient Fine-Tuning (PEFT) modules, each sequence corresponding to a different group permutation. Since the PEFT modules are lightweight and maintained server-side, FedSGT isolates the influence of different data groups into independent modules without incurring significant storage overhead or communication cost. Exact unlearning is thus achieved instantly by deactivating the modules corresponding to the group containing the unlearned data. Furthermore, using multiple training sequences helps maintain high model utility as deletion requests accumulate. We provide a rigorous theoretical analysis of both the deletion rate (the expected number of deletions before retraining is needed) and the expected model performance. Experiments on various tasks demonstrate that FedSGT sustains service significantly longer under multiple unlearning requests while maintaining learning performance and training efficiency comparable to other exact unlearning baselines. Extensive ablation studies validate the robustness of our method across a wide range of parameter settings.
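The server-side bookkeeping behind deactivation-based unlearning can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on one assumption not spelled out in the abstract: the module at position i of a sequence is trained on top of the earlier modules, so it depends on every group at positions up to i, and deleting a group invalidates that module and all later ones in the same sequence. The function names (`usable_prefix`, `best_sequence`, `needs_retraining`) and the example permutations are hypothetical.

```python
def usable_prefix(sequence, deleted):
    """Count the leading modules of `sequence` that no deleted group has touched.

    `sequence` is a tuple of group ids in training order (one permutation).
    Assumption: module i depends on groups sequence[0..i], so the first deleted
    group and everything after it must be deactivated.
    """
    for position, group in enumerate(sequence):
        if group in deleted:
            return position
    return len(sequence)


def best_sequence(sequences, deleted):
    """Return (index, prefix_length) of the sequence with the longest clean prefix."""
    lengths = [usable_prefix(seq, deleted) for seq in sequences]
    best = max(range(len(sequences)), key=lambda i: lengths[i])
    return best, lengths[best]


def needs_retraining(sequences, deleted):
    """Retraining is needed once no sequence retains any usable module."""
    return best_sequence(sequences, deleted)[1] == 0


# Three hypothetical permutations over four data groups.
sequences = [(0, 1, 2, 3), (3, 2, 1, 0), (1, 3, 0, 2)]

deleted = set()
for group in (3, 0, 1):  # deletion requests arrive one group at a time
    deleted.add(group)
    index, kept = best_sequence(sequences, deleted)
    print(f"deleted={sorted(deleted)} serve sequence {index} with {kept} module(s)")
```

In this toy run, the first deletion leaves three modules of the first permutation usable, the second forces a switch to the third permutation with one module, and the third leaves no usable module, which is the point at which retraining would be required. Under the stated assumption, this is why permutations that place different groups late can extend the number of deletions the service survives.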
