Any-Shift Prompting for Generalization over Distributions
Zehao Xiao, Jiayi Shen, Mohammad Mahdi Derakhshani, Shengcai Liao, Cees G. M. Snoek
TL;DR
The paper tackles the generalization of prompt learning under distribution shift in image-language models such as CLIP. It introduces any-shift prompting, a hierarchical probabilistic framework that links the training and test distributions through training and test prompts. A transformer-based inference network, trained with a pseudo-shift mechanism, generates a test-specific prompt in a single forward pass, with no test-time fine-tuning. The approach uses variational inference (an ELBO objective) to encourage informative test prompts, and it makes predictions by sampling prompts from the learned distributions. Experiments on 23 datasets show robust generalization under covariate, label, conditional, concept, and joint shifts, where the method outperforms or matches existing prompt-learning baselines while avoiding test-time optimization.
Abstract
Image-language models with prompt learning have shown remarkable advances in numerous downstream vision tasks. Nevertheless, conventional prompt learning methods overfit their training distribution and lose the generalization ability on test distributions. To improve generalization across various distribution shifts, we propose any-shift prompting: a general probabilistic inference framework that considers the relationship between training and test distributions during prompt learning. We explicitly connect training and test distributions in the latent space by constructing training and test prompts in a hierarchical architecture. Within this framework, the test prompt exploits the distribution relationships to guide the generalization of the CLIP image-language model from training to any test distribution. To effectively encode the distribution information and their relationships, we further introduce a transformer inference network with a pseudo-shift training mechanism. The network generates the tailored test prompt with both training and test information in a feedforward pass, avoiding extra training costs at test time. Extensive experiments on twenty-three datasets demonstrate the effectiveness of any-shift prompting on the generalization over various distribution shifts.
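To make the idea of predicting with sampled prompts concrete, the following is a minimal sketch, not the authors' implementation. It assumes the test prompt follows a diagonal Gaussian whose mean and log standard deviation are already given (in the paper they would come from the inference network), and that a sampled prompt conditions the class text features by simple addition. The function names, the additive conditioning, and the temperature value are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1):
    """Scale vectors to unit length along the given axis."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def sample_prompts(mean, log_std, num_samples, rng):
    """Draw prompt vectors from a diagonal Gaussian (reparameterized form)."""
    eps = rng.standard_normal((num_samples,) + mean.shape)
    return mean + np.exp(log_std) * eps


def predict_with_sampled_prompts(image_feat, class_feats, prompt_mean,
                                 prompt_log_std, num_samples=8,
                                 temperature=0.01, seed=0):
    """Average class probabilities over several sampled prompts.

    image_feat:  (D,)   image embedding (assumed precomputed)
    class_feats: (C, D) class text embeddings (assumed precomputed)
    prompt_mean, prompt_log_std: (D,) parameters of the prompt distribution
    """
    rng = np.random.default_rng(seed)
    prompts = sample_prompts(prompt_mean, prompt_log_std, num_samples, rng)  # (S, D)
    # Hypothetical conditioning: shift every class embedding by the sampled prompt.
    text = l2_normalize(class_feats[None, :, :] + prompts[:, None, :])       # (S, C, D)
    img = l2_normalize(image_feat)                                           # (D,)
    logits = (text @ img) / temperature                                      # (S, C)
    # Monte Carlo estimate of the predictive distribution.
    return softmax(logits, axis=-1).mean(axis=0)                             # (C,)
```

Averaging the per-sample probabilities gives a Monte Carlo estimate of the predictive distribution. With a near-zero variance the sketch reduces to an ordinary cosine-similarity classifier over the class embeddings.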
