Oblivionis: A Lightweight Learning and Unlearning Framework for Federated Large Language Models

Fuyao Zhang, Xinyu Yan, Tiantong Wu, Wenjie Li, Tianxiang Chen, Yang Cao, Ran Yan, Longtao Huang, Wei Yang Bryan Lim, Qiang Yang

TL;DR

Oblivionis tackles the challenge of forgetting specific private data in federated LLMs by introducing a lightweight dual-objective framework that combines federated fine-tuning and targeted unlearning, with LoRA enabling parameter-efficient updates. The approach is evaluated across six FL methods and five unlearning methods on the TOFU and MUSE benchmarks, showing that Oblivionis typically improves forgetting while preserving utility compared to local training. Key findings include strong forgetting performance from AOFL methods (e.g., FedAdagrad with SimNPO/NPO) and robust utility preservation from proximal FL methods (e.g., FedProx) in contextual QA tasks. The work provides an open-source platform, standardized benchmarks, and comprehensive cross-algorithm analyses to foster reproducible research in federated LLM unlearning and its regulatory implications.

Abstract

Large Language Models (LLMs) increasingly leverage Federated Learning (FL) to utilize private, task-specific datasets for fine-tuning while preserving data privacy. However, while federated LLM frameworks effectively enable collaborative training without raw data sharing, they critically lack built-in mechanisms for regulatory compliance like GDPR's right to be forgotten. Integrating private data heightens concerns over data quality and long-term governance, yet existing distributed training frameworks offer no principled way to selectively remove specific client contributions post-training. Due to distributed data silos, stringent privacy constraints, and the intricacies of interdependent model aggregation, federated LLM unlearning is significantly more complex than centralized LLM unlearning. To address this gap, we introduce Oblivionis, a lightweight learning and unlearning framework that enables clients to selectively remove specific private data during federated LLM training, enhancing trustworthiness and regulatory compliance. By unifying FL and unlearning as a dual optimization objective, we incorporate 6 FL and 5 unlearning algorithms for comprehensive evaluation and comparative analysis, establishing a robust pipeline for federated LLM unlearning. Extensive experiments demonstrate that Oblivionis outperforms local training, achieving a robust balance between forgetting efficacy and model utility, with cross-algorithm comparisons providing clear directions for future LLM development.
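The idea of treating federated learning and unlearning as one dual objective can be illustrated with a toy sketch. This is not the authors' implementation. It assumes FedAvg-style weighted aggregation and a simple gradient-difference local objective (descent on the retain loss, ascent on the forget loss), chosen here only for brevity; the NPO and SimNPO methods named in the summary are not implemented. The model is a single scalar weight, and the names `local_update`, `fedavg`, and `forget_weight` are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair for a toy model y ~ w * x


def grad(w: float, data: List[Sample]) -> float:
    """Gradient of the mean of 0.5 * (w*x - y)^2 with respect to w."""
    if not data:
        return 0.0
    return sum((w * x - y) * x for x, y in data) / len(data)


def local_update(w: float, retain: List[Sample], forget: List[Sample],
                 lr: float = 0.1, steps: int = 200,
                 forget_weight: float = 0.5) -> float:
    """Assumed dual objective: minimize retain loss, maximize forget loss."""
    for _ in range(steps):
        # Descent on retain data combined with (scaled) ascent on forget data.
        g = grad(w, retain) - forget_weight * grad(w, forget)
        w -= lr * g
    return w


def fedavg(weights: List[float], sizes: List[int]) -> float:
    """Server step: average client weights, weighted by local data size."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(weights, sizes)) / total


# One illustrative round with two hypothetical clients.
retain_a = [(1.0, 2.0), (2.0, 4.0)]   # client A: no unlearning request
retain_b = [(1.0, 2.0), (2.0, 4.0)]   # client B: same retain data
forget_b = [(1.0, 5.0)]               # client B asks to forget this sample

w_global = 0.0
w_a = local_update(w_global, retain_a, [])
w_b = local_update(w_global, retain_b, forget_b)
w_global = fedavg([w_a, w_b], [len(retain_a), len(retain_b)])
```

In this toy setting, client B's weight moves away from the value that fits the forgotten sample, while the server average stays close to the value that fits the retained data. The forget loss here is unbounded, so `forget_weight` has to stay small enough for the local update to remain stable.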

Paper Structure

This paper contains 49 sections, 20 equations, 20 figures, 10 tables.

Figures (20)

  • Figure 1: Illustration of the three-step LLM training process: (1) Pre-training the base model with public datasets on a centralized server; (2) Federated fine-tuning of the base model using private and sensitive task-specific data; (3) Federated targeted unlearning, which removes the influence of specific data upon client requests, addressing regulatory and ethical requirements. Areas enclosed by grey dashed boxes are our main contributions.
  • Figure 2: (a) Overview of the proposed Oblivionis framework. (b) Oblivionis integrates 6 representative federated learning algorithms, 5 machine unlearning methods, 2 federated fine-tuning methods (full-parameter and LoRA-based), and a variety of models. Oblivionis also supports 5 datasets and over 10 evaluation metrics. (c) Sample experimental results that showcase the divergent performance of 6 FL methods using the SimNPO unlearning algorithm on the TOFU dataset.
  • Figure 3: Comparative analysis of ROUGE scores across federated learning and unlearning methods using Llama-3.2-1B model with Split99 strategies. For the Forget set, lower scores indicate better performance ($\downarrow$), whereas for the remaining sets, higher scores are preferable ($\uparrow$).
  • Figure 4: Comparative analysis of Probability scores across federated learning and unlearning methods using Llama-3.2-1B model with Split99 strategies. For the Forget set, lower scores indicate better performance ($\downarrow$), whereas for the remaining sets, higher scores are preferable ($\uparrow$).
  • Figure 5: Comparison of Model Utility (MU) between local and federated learning across different unlearning methods.
  • ...and 15 more figures