Efficient Federated Prompt Tuning for Black-box Large Pre-trained Models

Zihao Lin, Yan Sun, Yifan Shi, Xueqian Wang, Lifu Huang, Li Shen, Dacheng Tao

TL;DR

The paper addresses distribution shifts in decentralized federated learning by integrating sharpness-aware minimization (SAM) into a gossip-based framework. It introduces two algorithms, DFedSAM and DFedSAM-MG, combining SAM updates with decentralized consensus and, in the MG variant, multiple gossip steps to accelerate convergence. Theoretical results provide convergence and generalization guarantees in non-convex settings, highlighting the influence of network topology and consensus depth. Empirical evaluation across different topologies corroborates improved generalization and robustness to communication structure. The work offers a principled approach to memory-efficient, privacy-preserving learning on black-box model deployments via decentralized optimization with SAM.

Abstract

With the rapid development of pre-trained models (PTMs), efficiently tuning these models for diverse downstream applications has become a pivotal research concern. Although recent work on prompt tuning offers promising avenues, three challenges persist: (1) memory constraints: the continuous growth in the size of open-source PTMs makes fine-tuning even a fraction of their parameters challenging for many practitioners; (2) model privacy: existing PTMs often function as public API services, with their parameters inaccessible for effective or tailored fine-tuning; (3) data privacy: fine-tuning PTMs requires high-quality datasets, which are typically localized and not shared publicly. To make the best use of each local dataset while navigating memory constraints and preserving privacy, we propose Federated Black-Box Prompt Tuning (Fed-BBPT). This approach avoids relying on access to model parameters, architectures, or private datasets; instead, a central server helps local users collaboratively train a prompt generator through regular aggregation. Local users learn through API calls with a zero-order optimizer, obviating the need for PTM deployment. Relative to extensive fine-tuning, Fed-BBPT sidesteps the memory challenges tied to PTM storage and fine-tuning on local machines while tapping into comprehensive, high-quality, yet private training datasets. A thorough evaluation across 40 datasets spanning CV and NLP tasks underscores the robustness of the proposed model.
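
The loop the abstract describes (clients that can only query a model for a loss value, a zero-order optimizer, and a server that aggregates regularly) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and is not taken from the paper: `black_box_loss` is a toy stand-in for an API call, a short parameter list stands in for the prompt generator's weights, the gradient estimate is a standard two-point random-direction estimator, and the aggregation is plain parameter averaging.

```python
import random


def black_box_loss(params, data):
    """Hypothetical stand-in for an API call: returns a scalar loss only.

    Here it is the mean squared error of a linear model on (x, y) pairs;
    no gradients are exposed to the caller.
    """
    total = 0.0
    for x, y in data:
        pred = sum(p * xi for p, xi in zip(params, x))
        total += (pred - y) ** 2
    return total / len(data)


def zo_gradient(params, data, rng, mu=1e-3, num_samples=20):
    """Two-point zeroth-order gradient estimate built from loss queries."""
    dim = len(params)
    grad = [0.0] * dim
    for _ in range(num_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # random direction
        plus = [p + mu * ui for p, ui in zip(params, u)]
        minus = [p - mu * ui for p, ui in zip(params, u)]
        slope = (black_box_loss(plus, data) - black_box_loss(minus, data)) / (2 * mu)
        for i in range(dim):
            grad[i] += slope * u[i] / num_samples
    return grad


def local_update(params, data, rng, lr=0.1, steps=5):
    """A client refines its copy of the parameters using only loss queries."""
    params = list(params)
    for _ in range(steps):
        g = zo_gradient(params, data, rng)
        params = [p - lr * gi for p, gi in zip(params, g)]
    return params


def fed_avg(client_params):
    """Server-side aggregation (assumed here to be a plain average)."""
    n = len(client_params)
    dim = len(client_params[0])
    return [sum(ps[i] for ps in client_params) / n for i in range(dim)]


def run(rounds=30, seed=0):
    """Simulate three clients with private toy datasets; data never leaves a client."""
    rng = random.Random(seed)
    true_w = [2.0, -1.0]
    clients = []
    for _ in range(3):
        xs = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(20)]
        clients.append([(x, sum(w * xi for w, xi in zip(true_w, x))) for x in xs])
    all_data = [pair for c in clients for pair in c]  # used for evaluation only
    global_params = [0.0, 0.0]
    initial_loss = black_box_loss(global_params, all_data)
    for _ in range(rounds):
        updates = [local_update(global_params, c, rng) for c in clients]
        global_params = fed_avg(updates)
    return initial_loss, black_box_loss(global_params, all_data), global_params
```

Calling `run()` returns the loss before and after training together with the aggregated parameters. Only parameter updates travel to the server, and each client touches the model purely through scalar loss queries, which is the property the abstract emphasizes.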

Paper Structure

This paper contains 9 sections, 5 theorems, 28 equations, 1 figure, 2 algorithms.

Key Result

Theorem 4.1

When we use multiple compressed gossip steps (the number of consensus steps per gradient iteration is $Q \ge 1$) to achieve a faster convergence rate and consider the impact of data homogeneity upon the rate of convergence, under Definition noniid_para and Assumptions 1-3, we can generate the resu… When the perturbation amplitude $\rho$ is proportional to the learning rate, e.g., $\rho = \mathcal{\dots}$ …
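
To make the role of $Q$ concrete, the following is a minimal sketch of running several gossip (consensus) steps per iteration. It rests on assumptions that are not in the source: a ring topology with uniform weights, one scalar per node standing in for a model vector, and no compression of the exchanged values. It shows only the generic effect that more gossip steps bring nodes closer to their average, not the theorem's rate.

```python
def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring (assumed topology).

    Each node averages itself and its two neighbours with weight 1/3.
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def gossip(values, W, Q):
    """Apply Q consensus steps: each step replaces values with W @ values."""
    n = len(values)
    for _ in range(Q):
        values = [sum(W[i][j] * values[j] for j in range(n)) for i in range(n)]
    return values


def consensus_error(values):
    """Squared distance of the node values from their network-wide mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


if __name__ == "__main__":
    W = ring_mixing_matrix(8)
    start = [float(i) for i in range(8)]  # toy per-node values
    for Q in (1, 2, 4):
        print(Q, consensus_error(gossip(start, W, Q)))
```

Because the matrix is doubly stochastic, the network-wide mean is preserved while the disagreement between nodes shrinks as $Q$ grows, at the cost of $Q$ communication rounds per iteration.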

Theorems & Definitions (5)

  • Theorem 4.1
  • Lemma 4.2
  • Lemma 4.3
  • Lemma 4.4
  • Lemma 4.5