DiPrompT: Disentangled Prompt Tuning for Multiple Latent Domain Generalization in Federated Learning

Sikai Bai, Jie Zhang, Shuaicheng Li, Song Guo, Jingcai Guo, Jun Hou, Tao Han, Xiaocheng Lu

TL;DR

The proposed Disentangled Prompt Tuning (DiPrompT) achieves superior domain generalization performance over state-of-the-art FL methods when domain labels are not provided, and even outperforms many centralized learning methods using domain labels.

Abstract

Federated learning (FL) has emerged as a powerful paradigm for learning from decentralized data, and federated domain generalization further considers the setting in which the test dataset (target domain) is absent from the decentralized training data (source domains). However, most existing FL methods assume that domain labels are provided during training, and their evaluation imposes explicit constraints on the number of domains, which must strictly match the number of clients. Given the underutilization of numerous edge devices and the need for additional cross-client domain annotations in the real world, such restrictions may be impractical and involve potential privacy leaks. In this paper, we propose an efficient and novel approach, Disentangled Prompt Tuning (DiPrompT), which tackles the above restrictions by learning adaptive prompts for domain generalization in a distributed manner. Specifically, we first design two types of prompts: a global prompt to capture general knowledge across all clients, and domain prompts to capture domain-specific knowledge. These prompts eliminate the restriction of a one-to-one mapping between source domains and local clients. Furthermore, a dynamic query metric is introduced to automatically search for a suitable domain label for each sample, using a two-substep text-image alignment based on prompt tuning, without labor-intensive annotation. Extensive experiments on multiple datasets demonstrate that DiPrompT achieves superior domain generalization performance over state-of-the-art FL methods when domain labels are not provided, and even outperforms many centralized learning methods that use domain labels.

Paper Structure

This paper contains 22 sections, 8 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: The motivating example and our main idea. (a) When clients outnumber source domains, learning domain-invariant features may become challenging due to imbalanced contributions across domains/clients. Note that the contribution imbalance of local data is measured through its feature distance to the target domain. (b) DiPrompT separates domain-specific features and general knowledge during local training. (c) During inference, DiPrompT adaptively ensembles generic knowledge and valuable domain-specific knowledge for better target domain prediction.
  • Figure 2: Illustration of Disentangled Prompt Tuning (DiPrompT) during training. We devise an alternating optimization strategy to update the two key modules (disentangled prompt learning and the dynamic query scheme), which mainly comprises six steps, excluding communication between clients and the server. We first generate image embeddings and update Q-Prompt via steps 2 and 3, respectively. Then the suitable D-Prompt is selected using Q-Prompt in step 4. G-Prompt and D-Prompt in disentangled prompt learning are simultaneously optimized using steps 5 and 6. Finally, in step 8, the central server performs the beta moving average update for D-Prompts to avoid client drift.
  • Figure 3: Analysis of few-shot settings and different backbone architectures on the PACS and VLCS datasets.
  • Figure 4: Hyperparameter analysis of the number of clients $K$ and the weight coefficient $\lambda$ on PACS.
  • Figure 5: Comparison of computation and communication cost of DiPrompT and other federated domain generalization methods.
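
The query-based selection and moving average update mentioned in the Figure 2 caption can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the similarity measure (cosine), the function names `select_domain_prompt` and `moving_average_update`, the blending form `beta * old + (1 - beta) * new`, and the toy vectors standing in for image and prompt embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def select_domain_prompt(image_embedding, query_embeddings):
    """Return the index of the query embedding most similar to the image.

    Assumption: each query embedding corresponds to one latent domain, and
    the best-matching index is used to pick the domain prompt for the sample.
    """
    scores = [cosine(image_embedding, q) for q in query_embeddings]
    return max(range(len(scores)), key=scores.__getitem__)


def moving_average_update(old_prompt, new_prompt, beta):
    """Blend the previous prompt with a newly aggregated one.

    Assumption: a convex combination weighted by beta; the paper's exact
    update rule may differ.
    """
    return [beta * o + (1.0 - beta) * n for o, n in zip(old_prompt, new_prompt)]


# Toy data: one image embedding and three hypothetical query embeddings.
image_embedding = [0.9, 0.1, 0.0]
query_embeddings = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

selected = select_domain_prompt(image_embedding, query_embeddings)
print(selected)  # index of the closest query embedding

updated = moving_average_update([1.0, 0.0], [0.0, 1.0], beta=0.75)
print(updated)
```

In this sketch no domain label is supplied for the sample; the index is inferred from embedding similarity alone, which is the property the abstract attributes to the dynamic query metric. A larger `beta` keeps the server-side prompt closer to its previous value.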