
FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning

Weirui Kuang, Bingchen Qian, Zitao Li, Daoyuan Chen, Dawei Gao, Xuchen Pan, Yuexiang Xie, Yaliang Li, Bolin Ding, Jingren Zhou

TL;DR

This work addresses the challenge of federated fine-tuning of large language models under privacy and resource constraints by introducing FS-LLM, an open-source package built on FederatedScope. FS-LLM provides an end-to-end benchmarking pipeline (LLM-Benchmarks), a library of PEFT and privacy-preserving algorithms (LLM-AlgZoo), and an accelerated training framework (LLM-Trainer) to enable efficient, extensible FL experimentation across simulated, distributed, and clustered settings. The authors validate the approach with extensive experiments, showing LoRA commonly yields strong performance gains, exploring privacy-preserving options like FedOT, and analyzing personalized FL and FedHPO in FL-LLMs. The work offers practical tooling and insights to advance federated fine-tuning research and is released at the given GitHub repository for community adoption.

Abstract

LLMs have demonstrated great capabilities in various NLP tasks. Different entities can further improve the performance of those LLMs on their specific downstream tasks by fine-tuning them. When several entities have similar tasks of interest but cannot share their data because of privacy concerns and regulations, federated learning (FL) is a mainstream solution to leverage the data of different entities. However, fine-tuning LLMs in federated learning settings still lacks adequate support from existing FL frameworks because it has to deal with optimizing the consumption of significant communication and computational resources, data preparation for different tasks, and distinct information protection demands. This paper first discusses these challenges of federated fine-tuning of LLMs and then introduces our package FS-LLM as a main contribution, which consists of the following components: (1) we build an end-to-end benchmarking pipeline, automating the processes of dataset preprocessing, federated fine-tuning execution, and performance evaluation on federated LLM fine-tuning; (2) we provide comprehensive federated parameter-efficient fine-tuning algorithm implementations and versatile programming interfaces for future extension in FL scenarios with low communication and computation costs, even without accessing the full model; (3) we adopt several accelerating and resource-efficient operators for fine-tuning LLMs with limited resources, along with flexible pluggable sub-routines for interdisciplinary study. We conduct extensive experiments to validate the effectiveness of FS-LLM and benchmark advanced LLMs with state-of-the-art parameter-efficient fine-tuning algorithms in FL settings, which also yields valuable insights into federated fine-tuning of LLMs for the research community. To facilitate further research and adoption, we release FS-LLM at https://github.com/alibaba/FederatedScope/tree/llm.

Paper Structure

This paper contains 26 sections, 2 equations, 6 figures, 16 tables.

Figures (6)

  • Figure 1: Overview of the architecture of FS-LLM, which consists of three main modules: LLM-Benchmarks, LLM-AlgZoo, and LLM-Trainer. As an example in the figure, we use PEFT algorithms to fine-tune LLaMA in FL, with half-precision training, an offloading strategy, and the pFedMe algorithm. Under this learning paradigm, the clients can efficiently train on their local data with limited hardware resources, while the communication between the clients and the server only requires transmitting the adapter (which typically has very few parameters). This achieves high efficiency in both communication and computation. In the figure, Acc. stands for accelerating operator, Perf. stands for performance, Comm. stands for communication, Comp. stands for computation, and Fair. stands for fairness.
  • Figure 2: The unified interfaces for federated fine-tuning of LLMs with or without access to the full model. When the LLM is not accessible to clients, different algorithms (including distillation, pruning, and quantization) can be used to generate an emulator via the ① LLM pre-processing interface; if the LLM is accessible, ① simply returns its input by default. The other three interfaces in the figure are ② initial model broadcast, ③ shared parameter aggregation, and ④ parameter re-distribution.
  • Figure 3: FS-LLM integrates DeepSpeed for federated fine-tuning under different hardware conditions. Rank $0$ indicates the main process for multi-GPU training; some modules of the other subprocesses are disabled (e.g., logging and saving checkpoints). Msg stands for the messages transmitted between the server and the clients, which trigger events.
  • Figure 4: Visualization of the performance comparison of fine-tuned LLaMA-7B and OPT-2.7B under federated and local scenarios. (The axes are scaled to highlight the differences.)
  • Figure 5: Fine-tuning LLMs in pFL (Left) and FedHPO (Right).
  • ...and 1 more figure
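
The four interfaces named in the Figure 2 caption, together with the adapter-only communication described in the Figure 1 caption, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the `adapter.` key prefix, the plain-list parameter format, and the weighted averaging (a FedAvg-style rule swapped in as the aggregator) are hypothetical and are not taken from the FS-LLM code base.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical parameter format: name -> flat list of floats.
Params = Dict[str, List[float]]


def preprocess_model(full_model: Params,
                     emulate: Optional[Callable[[Params], Params]] = None) -> Params:
    """Interface 1: build an emulator if a compressor is given, else return the input."""
    return emulate(full_model) if emulate is not None else full_model


def broadcast(model: Params, num_clients: int) -> List[Params]:
    """Interface 2: give every client its own copy of the initial model."""
    return [{k: list(v) for k, v in model.items()} for _ in range(num_clients)]


def select_shared(model: Params, prefix: str = "adapter.") -> Params:
    """Keep only the parameters that are communicated (assumed: adapter weights)."""
    return {k: list(v) for k, v in model.items() if k.startswith(prefix)}


def aggregate(updates: List[Params], weights: List[float]) -> Params:
    """Interface 3: weighted average of the shared parameters (FedAvg-style, assumed)."""
    total = sum(weights)
    return {
        key: [
            sum(w * u[key][i] for u, w in zip(updates, weights)) / total
            for i in range(len(updates[0][key]))
        ]
        for key in updates[0]
    }


def redistribute(client_models: List[Params], shared: Params) -> None:
    """Interface 4: write the aggregated shared parameters back to each client."""
    for model in client_models:
        model.update({k: list(v) for k, v in shared.items()})


# Toy round: two clients, one frozen base tensor, one trainable adapter tensor.
model = {"base.w": [1.0, 2.0], "adapter.a": [0.0, 0.0]}
clients = broadcast(preprocess_model(model), num_clients=2)
clients[0]["adapter.a"] = [1.0, 3.0]  # stand-in for local fine-tuning
clients[1]["adapter.a"] = [3.0, 5.0]
shared = aggregate([select_shared(c) for c in clients], weights=[1.0, 3.0])
redistribute(clients, shared)
```

In this toy round only `adapter.a` leaves the clients, while `base.w` never changes or travels, which mirrors the communication saving the Figure 1 caption describes.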