PromptTA: Prompt-driven Text Adapter for Source-free Domain Generalization

Haoran Zhang, Shuanghao Bai, Wanqi Zhou, Jingwen Fu, Badong Chen

TL;DR

This work proposes the Prompt-Driven Text Adapter (PromptTA) method, which is designed to better capture the distribution of style features and to employ resampling for thorough coverage of domain knowledge.

Abstract

Source-free domain generalization (SFDG) tackles the challenge of adapting models to unseen target domains without access to source domain data. To address this task, recent advances in SFDG have primarily focused on leveraging the text modality of vision-language models such as CLIP. These methods either develop a transferable linear classifier based on diverse style features extracted from the text and learned prompts, or derive domain-unified text representations from domain banks. However, both style features and domain banks have limitations in capturing comprehensive domain knowledge. In this work, we propose the Prompt-Driven Text Adapter (PromptTA) method, which is designed to better capture the distribution of style features and to employ resampling for thorough coverage of domain knowledge. To further leverage this rich domain information, we introduce a text adapter that learns from these style features for efficient domain information storage. Extensive experiments conducted on four benchmark datasets demonstrate that PromptTA achieves state-of-the-art performance. The code is available at https://github.com/zhanghr2001/PromptTA.

Paper Structure (8 sections, 6 equations, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Comparison of CLIP-based domain generalization methods. (a) and (b) require source domain data for fine-tuning, while (c) PromptStyler (Cho et al., 2023) and our method operate without such data. Our method uniquely leverages diverse domain information through style feature resampling and a text adapter.
  • Figure 2: Overall framework of PromptTA. Initially, the style generation process yields a fixed set of style features. These features are then enhanced through Style Feature Resampling to capture comprehensive domain knowledge. Both the original and the resampled style features are used to train a linear classifier and a text adapter. Note that the encoders are taken from the CLIP model (Radford et al., 2021).
  • Figure 3: t-SNE (van der Maaten and Hinton, 2008) visualization of the original style features, style features from a single resampling instance, and class token features on the PACS dataset. Different colors denote different classes.
  • Figure 4: Sensitivity analysis of the hyperparameters $\alpha$ and $\beta$ on the PACS dataset.
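
The resampling step described in the Figure 2 caption can be illustrated with a small sketch. This is not the authors' implementation: the excerpt does not say how the style-feature distribution is modeled, so the sketch assumes a diagonal Gaussian fitted per dimension. All function names, the random stand-in features, and the sizes (8 styles, 4 dimensions, 16 resamples) are hypothetical.

```python
import random
import statistics


def fit_diagonal_gaussian(features):
    """Return per-dimension means and standard deviations of the feature vectors."""
    dims = list(zip(*features))  # transpose: one tuple per dimension
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) for d in dims]
    return means, stds


def resample_style_features(features, n_samples, rng):
    """Draw new vectors from the fitted Gaussian (an assumed model, see lead-in)."""
    means, stds = fit_diagonal_gaussian(features)
    return [[rng.gauss(m, s) for m, s in zip(means, stds)] for _ in range(n_samples)]


def l2_normalize(vec):
    """Scale a vector to unit length, as is common for CLIP-style embeddings."""
    norm = sum(x * x for x in vec) ** 0.5
    return [x / norm for x in vec] if norm > 0 else list(vec)


rng = random.Random(0)

# Hypothetical stand-in for the fixed set of style features from style generation.
style_features = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]

# Resample, then pool original and resampled features for downstream training.
resampled = resample_style_features(style_features, n_samples=16, rng=rng)
training_pool = [l2_normalize(v) for v in style_features + resampled]
```

In a full pipeline, a pool like `training_pool` would feed the linear classifier and the text adapter; those components are omitted here because the excerpt does not specify their architecture or losses.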