Towards Building the Federated GPT: Federated Instruction Tuning

Jianyi Zhang, Saeed Vahidian, Martin Kuo, Chunyuan Li, Ruiyi Zhang, Tong Yu, Yufan Zhou, Guoyin Wang, Yiran Chen

TL;DR

Federated Instruction Tuning (FedIT) proposes privacy-preserving, distributed instruction tuning for large language models by combining Federated Learning with Parameter-Efficient Fine-Tuning (LoRA). Clients keep instructions locally and share only trainable adapters, which are aggregated via FedAvg to form a joint model, significantly reducing communication and computation. The authors introduce Shepherd, a GitHub-based framework to enable FedIT experimentation across heterogeneous datasets, and demonstrate through GPT-4 automatic evaluation that FedIT leverages data diversity to improve generalization beyond local fine-tuning, approaching centralized performance in some cases. They also discuss practical considerations and future directions, including overhead, privacy, personalization, and defense against malicious participants. The work highlights data heterogeneity as a potential strength for FL-based tuning and provides a concrete platform for further research in privacy-preserving instruction alignment of LLMs.
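
The aggregation step described above, where the server averages the adapters uploaded by clients, can be sketched in a few lines. This is a minimal illustration, not the Shepherd implementation: the names `fedavg`, `lora_A`, and `lora_B`, the flattened-list representation of adapter weights, and the choice to weight each client by its number of local examples (the usual FedAvg convention) are all assumptions made for the example.

```python
from typing import Dict, List

# Assumed representation: parameter name -> flattened LoRA weights.
Adapter = Dict[str, List[float]]


def fedavg(adapters: List[Adapter], num_examples: List[int]) -> Adapter:
    """Average client adapters, weighting each by its local dataset size."""
    if not adapters or len(adapters) != len(num_examples):
        raise ValueError("need one example count per client adapter")
    total = sum(num_examples)
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    merged: Adapter = {}
    for name in adapters[0]:
        size = len(adapters[0][name])
        # Element-wise weighted mean across all clients for this parameter.
        merged[name] = [
            sum(a[name][i] * n for a, n in zip(adapters, num_examples)) / total
            for i in range(size)
        ]
    return merged


# Hypothetical toy round: two clients, each uploading only its adapter weights.
client_a = {"lora_A": [1.0, 2.0], "lora_B": [0.0, 4.0]}
client_b = {"lora_A": [3.0, 6.0], "lora_B": [8.0, 0.0]}
global_adapter = fedavg([client_a, client_b], num_examples=[1, 3])
print(global_adapter)  # {'lora_A': [2.5, 5.0], 'lora_B': [6.0, 1.0]}
```

Only the small adapter dictionaries cross the network in this sketch; the frozen base model and the raw instructions stay on each client, which is where the communication savings mentioned above would come from.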

Abstract

While "instruction-tuned" generative large language models (LLMs) have demonstrated an impressive ability to generalize to new tasks, their training relies heavily on large amounts of diverse, high-quality instruction data (such as that used for ChatGPT and GPT-4). Acquiring such data, especially human-written data, is costly and often difficult. Privacy concerns can further limit access to it. Consequently, the generality of the tuned models suffers, and their effectiveness may be restricted in certain contexts. To tackle this issue, our study introduces Federated Instruction Tuning (FedIT), which uses federated learning (FL) as the learning framework for the instruction tuning of LLMs. This marks the first exploration of FL-based instruction tuning for LLMs. It is especially important because text data is predominantly generated by end users. It is therefore imperative to design and adapt FL approaches that effectively leverage these users' diverse instructions stored on local devices while preserving privacy and ensuring data security. Using the widely adopted GPT-4 auto-evaluation, we demonstrate that exploiting the heterogeneous and diverse instruction sets held by clients with the proposed FedIT framework improves LLM performance compared with centralized training on only limited local instructions. We also developed a GitHub repository named Shepherd, which offers a foundational framework for exploring federated fine-tuning of LLMs using heterogeneous instructions across diverse categories.

Paper Structure

This paper contains 24 sections, 1 equation, 3 figures, 5 tables, 1 algorithm.

Figures (3)

  • Figure 1: The framework of Federated Instruction Tuning (FedIT)
  • Figure 2: Illustration of the heterogeneity of FedIT with the Databricks-dolly-15k instruction dataset. The model can be trained either on only the particular local instruction categories of each user (bottom left) or, by implementing FedIT, on the local instruction datasets of all clients, which offer greater diversity and quantity of data points and cover the entire range of subject matter (bottom right).
  • Figure 3: The relative scores of all models against ChatGPT (GPT-3.5-turbo)