A Survey on Federated Fine-tuning of Large Language Models

Yebo Wu, Chunlin Tian, Jingguang Li, He Sun, Kahou Tam, Zhanting Zhou, Haicheng Liao, Jing Xiong, Zhijiang Guo, Li Li, Chengzhong Xu

TL;DR

This survey systematically reviews FedLLM, examines existing Parameter-Efficient Fine-tuning (PEFT) methods, and assesses their applicability within the Federated Learning (FL) framework.

Abstract

Large Language Models (LLMs) have demonstrated impressive success across various tasks. Integrating LLMs with Federated Learning (FL), a paradigm known as FedLLM, offers a promising avenue for collaborative model adaptation while preserving data privacy. This survey provides a systematic and comprehensive review of FedLLM. We begin by tracing the historical development of both LLMs and FL, summarizing relevant prior research to set the context. Subsequently, we delve into an in-depth analysis of the fundamental challenges inherent in deploying FedLLM. Addressing these challenges often requires efficient adaptation strategies; therefore, we conduct an extensive examination of existing Parameter-Efficient Fine-tuning (PEFT) methods and explore their applicability within the FL framework. To rigorously evaluate the performance of FedLLM, we undertake a thorough review of existing fine-tuning datasets and evaluation benchmarks. Furthermore, we discuss FedLLM's diverse real-world applications across multiple domains. Finally, we identify critical open challenges and outline promising research directions to foster future advancements in FedLLM. This survey aims to serve as a foundational resource for researchers and practitioners, offering valuable insights into the rapidly evolving landscape of federated fine-tuning for LLMs. It also establishes a roadmap for future innovations in privacy-preserving AI. We actively maintain a GitHub repo (https://github.com/Clin0212/Awesome-Federated-LLM-Learning) to track cutting-edge advancements in this field.

Paper Structure

This paper contains 50 sections, 6 equations, 12 figures, 14 tables.

Figures (12)

  • Figure 1: Illustration of three LLM fine-tuning paradigms: (a) Centralized Fine-tuning, where data is aggregated at a central server; (b) Local Fine-tuning, where models are trained independently on private datasets; and (c) Federated Fine-tuning, where data remains local, and model updates are aggregated by a central server to create a global model.
  • Figure 2: Overall structure of the survey.
  • Figure 3: Architecture of LLMs.
  • Figure 4: Schematic illustration of the two-stage LLM training process: 1) auto-regressive pre-training on large-scale corpora to develop general linguistic capabilities, followed by 2) supervised fine-tuning to align model outputs with specific task requirements or human preferences.
  • Figure 5: Comparison of model parameters across BERT and LLaMA series models.
  • ...and 7 more figures