Table of Contents
Fetching ...

Harmonizing Generalization and Personalization in Federated Prompt Learning

Tianyu Cui, Hongxia Li, Jingya Wang, Ye Shi

TL;DR

This work proposes Federated Prompt Learning with CLIP Generalization and low-rank Personalization (FedPGP), which employs pre-trained CLIP to provide knowledge-guidance on the global prompt for improved generalization and incorporates a low-rank adaptation term to personalize the global prompt.

Abstract

Federated Prompt Learning (FPL) incorporates large pre-trained Vision-Language models (VLM) into federated learning through prompt tuning. The transferable representations and remarkable generalization capacity of VLM make them highly compatible with the integration of federated learning. Addressing data heterogeneity in federated learning requires personalization, but excessive focus on it across clients could compromise the model's ability to generalize effectively. To preserve the impressive generalization capability of VLM, it is crucial to strike a balance between personalization and generalization in FPL. To tackle this challenge, we proposed Federated Prompt Learning with CLIP Generalization and low-rank Personalization (FedPGP), which employs pre-trained CLIP to provide knowledge-guidance on the global prompt for improved generalization and incorporates a low-rank adaptation term to personalize the global prompt. Further, FedPGP integrates a prompt-wise contrastive loss to achieve knowledge guidance and personalized adaptation simultaneously, enabling a harmonious balance between personalization and generalization in FPL. We conduct extensive experiments on various datasets to explore base-to-novel generalization in both category-level and domain-level scenarios with heterogeneous data, showing the superiority of FedPGP in balancing generalization and personalization.

Harmonizing Generalization and Personalization in Federated Prompt Learning

TL;DR

This work proposes Federated Prompt Learning with CLIP Generalization and low-rank Personalization (FedPGP), which employs pre-trained CLIP to provide knowledge-guidance on the global prompt for improved generalization and incorporates a low-rank adaptation term to personalize the global prompt.

Abstract

Federated Prompt Learning (FPL) incorporates large pre-trained Vision-Language models (VLM) into federated learning through prompt tuning. The transferable representations and remarkable generalization capacity of VLM make them highly compatible with the integration of federated learning. Addressing data heterogeneity in federated learning requires personalization, but excessive focus on it across clients could compromise the model's ability to generalize effectively. To preserve the impressive generalization capability of VLM, it is crucial to strike a balance between personalization and generalization in FPL. To tackle this challenge, we proposed Federated Prompt Learning with CLIP Generalization and low-rank Personalization (FedPGP), which employs pre-trained CLIP to provide knowledge-guidance on the global prompt for improved generalization and incorporates a low-rank adaptation term to personalize the global prompt. Further, FedPGP integrates a prompt-wise contrastive loss to achieve knowledge guidance and personalized adaptation simultaneously, enabling a harmonious balance between personalization and generalization in FPL. We conduct extensive experiments on various datasets to explore base-to-novel generalization in both category-level and domain-level scenarios with heterogeneous data, showing the superiority of FedPGP in balancing generalization and personalization.
Paper Structure (20 sections, 13 equations, 4 figures, 17 tables, 1 algorithm)

This paper contains 20 sections, 13 equations, 4 figures, 17 tables, 1 algorithm.

Figures (4)

  • Figure 1: Pipeline of FedPGP. On the left, clients send global prompts to the server for aggregation while retaining adaptation term locally. The right shows the workflow of CLIP knowledge-guidance and low-rank adaptation with an additional contrastive loss to balance generalization and personalization.
  • Figure 2: Quantitative comparisons on four datasets across varying shot numbers and parameter $\mu$ of contrastive loss in FedPGP over 10 clients.
  • Figure 3: Visual examples of raw instances from two datasets with multiple domains: "Bird" in DomainNet (left) and "Bike" in Office-Caltech10 (right).
  • Figure 4: Accuracy learning curves of FedPGP and baselines on four datasets over 10 clients.