ReBaPL: Repulsive Bayesian Prompt Learning
Yassir Bendou, Omar Ezzahir, Eduardo Fernandes Montesuma, Gabriel Mahuas, Victoria Shevchenko, Mike Gartrell
TL;DR
Prompt learning for vision-language models often overfits and generalizes poorly out of distribution (OOD). ReBaPL (Repulsive Bayesian Prompt Learning) uses cyclical SGHMC with a representation-space repulsion term to sample from a multimodal posterior over prompts, exploring multiple high-quality modes. Repulsion is driven by distances between the representation distributions induced by different prompts, measured with the Maximum Mean Discrepancy (MMD) or the 2-Wasserstein distance ($W_2$), which yields diverse, functionally distinct prompts and prevents premature mode collapse. The approach is plug-and-play: applied to MaPLe and MMRL, it improves base-to-novel, cross-dataset, and domain generalization across multiple datasets, and ablations confirm the benefit of both repulsion and multimodal exploration.
Abstract
Prompt learning has emerged as an effective technique for fine-tuning large-scale foundation models for downstream tasks. However, conventional prompt tuning methods are prone to overfitting and can struggle with out-of-distribution generalization. To address these limitations, Bayesian prompt learning has been proposed, which frames prompt optimization as a Bayesian inference problem to enhance robustness. This paper introduces Repulsive Bayesian Prompt Learning (ReBaPL), a novel method for Bayesian prompt learning, designed to efficiently explore the complex and often multimodal posterior landscape of prompts. Our method integrates a cyclical step-size schedule with a stochastic gradient Hamiltonian Monte Carlo (SGHMC) algorithm, enabling alternating phases of exploration to discover new modes, and exploitation to refine existing modes. Furthermore, we introduce a repulsive force derived from a potential function over probability metrics (including Maximum Mean Discrepancy and Wasserstein distance) computed on the distributions of representations produced by different prompts. This representation-space repulsion diversifies exploration and prevents premature collapse to a single mode. Our approach allows for a more comprehensive characterization of the prompt posterior distribution, leading to improved generalization. In contrast to prior Bayesian prompt learning methods, our method provides a modular plug-and-play Bayesian extension of any existing prompt learning method based on maximum likelihood estimation. We demonstrate the efficacy of ReBaPL on several benchmark datasets, showing superior performance over state-of-the-art methods for prompt learning.
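The two ingredients named in the abstract, a cyclical step-size schedule and a distance between representation distributions, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation. The cosine form of the schedule is assumed from the cyclical SG-MCMC literature, the RBF kernel and its bandwidth are illustrative choices, and the names `cyclical_step_size` and `rbf_mmd2` are hypothetical. The abstract does not specify the repulsive potential, so it is not reproduced here.

```python
import numpy as np


def cyclical_step_size(k, total_steps, num_cycles, eps0):
    """Cosine cyclical step size (assumed form, not confirmed by the abstract).

    Each cycle starts at eps0 (large steps, exploration) and decays
    toward 0 (small steps, exploitation) before restarting.
    """
    cycle_len = int(np.ceil(total_steps / num_cycles))
    pos = (k % cycle_len) / cycle_len  # position within the cycle, in [0, 1)
    return 0.5 * eps0 * (np.cos(np.pi * pos) + 1.0)


def rbf_mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD with an RBF kernel.

    x and y are (n, d) and (m, d) arrays of representations, e.g. features
    produced by two different prompts on the same batch (an assumption).
    """
    def gram(a, b):
        # Pairwise squared Euclidean distances, clipped at 0 for stability.
        sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

    return gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean()


# Toy usage with random features standing in for prompt representations.
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(32, 8))
feats_b = feats_a + 3.0  # a clearly shifted distribution
mmd_same = rbf_mmd2(feats_a, feats_a)
mmd_diff = rbf_mmd2(feats_a, feats_b)
```

In a sampler of this kind, a larger distance between the representation sets of two prompts would correspond to a weaker repulsive push between them, while the step size returned by the schedule would control whether the current iteration explores or refines.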
