ReBaPL: Repulsive Bayesian Prompt Learning

Yassir Bendou, Omar Ezzahir, Eduardo Fernandes Montesuma, Gabriel Mahuas, Victoria Shevchenko, Mike Gartrell

TL;DR

Prompt learning for vision-language models often overfits and generalizes poorly out of distribution (OOD). ReBaPL (Repulsive Bayesian Prompt Learning) uses cyclical SGHMC with a representation-space repulsion term to sample from a multimodal posterior over prompts, exploring multiple high-quality modes. The repulsion is driven by distances between representation distributions, measured with Maximum Mean Discrepancy ($\mathrm{MMD}$) or the Wasserstein distance ($W_2$), which yields diverse, functionally distinct prompts and avoids premature mode collapse. The approach is plug-and-play: applied to MaPLe and MMRL, it improves base-to-novel, cross-dataset, and domain generalization across multiple datasets, and ablations confirm the benefit of repulsion and multimodal exploration.

Abstract

Prompt learning has emerged as an effective technique for fine-tuning large-scale foundation models for downstream tasks. However, conventional prompt tuning methods are prone to overfitting and can struggle with out-of-distribution generalization. To address these limitations, Bayesian prompt learning has been proposed, which frames prompt optimization as a Bayesian inference problem to enhance robustness. This paper introduces Repulsive Bayesian Prompt Learning (ReBaPL), a novel method for Bayesian prompt learning, designed to efficiently explore the complex and often multimodal posterior landscape of prompts. Our method integrates a cyclical step-size schedule with a stochastic gradient Hamiltonian Monte Carlo (SGHMC) algorithm, enabling alternating phases of exploration to discover new modes, and exploitation to refine existing modes. Furthermore, we introduce a repulsive force derived from a potential function over probability metrics (including Maximum Mean Discrepancy and Wasserstein distance) computed on the distributions of representations produced by different prompts. This representation-space repulsion diversifies exploration and prevents premature collapse to a single mode. Our approach allows for a more comprehensive characterization of the prompt posterior distribution, leading to improved generalization. In contrast to prior Bayesian prompt learning methods, our method provides a modular plug-and-play Bayesian extension of any existing prompt learning method based on maximum likelihood estimation. We demonstrate the efficacy of ReBaPL on several benchmark datasets, showing superior performance over state-of-the-art methods for prompt learning.
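The alternation between exploration and refinement described in the abstract can be illustrated with a small toy sampler. The sketch below is not the authors' implementation: the cosine form of the schedule, the noise-free exploration phase, the double-well potential, and the names `cyclical_step_size`, `cyclical_sghmc`, `friction`, and `explore_frac` are all assumptions chosen for illustration, in the spirit of common cyclical SG-MCMC practice.

```python
import math
import random


def cyclical_step_size(k, total_steps, num_cycles, base_step):
    """Cosine step size that restarts at the start of every cycle (assumed form)."""
    cycle_len = math.ceil(total_steps / num_cycles)
    pos = (k % cycle_len) / cycle_len  # position inside the current cycle, in [0, 1)
    return 0.5 * base_step * (math.cos(math.pi * pos) + 1.0)


def grad_u(theta):
    """Gradient of a toy double-well potential U(theta) = (theta^2 - 1)^2."""
    return 4.0 * theta * (theta * theta - 1.0)


def cyclical_sghmc(total_steps=2000, num_cycles=4, base_step=0.01,
                   friction=0.1, explore_frac=0.8, theta0=0.5, seed=0):
    """Toy 1D SGHMC with a cyclical step size; returns sampling-phase iterates."""
    rng = random.Random(seed)
    cycle_len = math.ceil(total_steps / num_cycles)
    explore_steps = round(explore_frac * cycle_len)
    theta, velocity = theta0, 0.0
    samples = []
    for k in range(total_steps):
        step = cyclical_step_size(k, total_steps, num_cycles, base_step)
        in_sampling_phase = (k % cycle_len) >= explore_steps
        # Exploration (large steps): momentum descent without injected noise.
        # Sampling (small steps): add Gaussian noise with variance 2 * friction * step.
        noise = rng.gauss(0.0, math.sqrt(2.0 * friction * step)) if in_sampling_phase else 0.0
        velocity = (1.0 - friction) * velocity - step * grad_u(theta) + noise
        theta += velocity
        if in_sampling_phase:
            samples.append(theta)
    return samples


samples = cyclical_sghmc()
print(len(samples), min(samples), max(samples))
```

Each cycle starts with a large step size, which lets the chain move far from its current location, and ends with small noisy steps whose iterates are kept as samples. This toy has no repulsion term, so nothing prevents successive cycles from revisiting the same mode.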

Paper Structure

This paper contains 19 sections, 24 equations, 7 figures, 5 tables, 1 algorithm.

Figures (7)

  • Figure 1: Overview of our proposed ReBaPL approach, in a multi-modal prompt learning setting. Text and image encoders receive text ($\mathbf{W_0}$) and image ($\mathbf{E_0}$) embeddings as input, combined with learnable tokens of context prompts ($\mathbf{P}$). The terms pertaining to the exploration and sampling stages and the repulsion force are colored in blue, red and green respectively.
  • Figure 2: Conceptual illustration of the benefits of repulsive MCMC on a mixture of Gaussian distributions. In (a), we show the potential $U(\theta)$ alongside the vector field $\nabla U(\theta)$. The blue point, $\theta_{k,T}^{(c-1)}$, represents the minimum of $\theta \mapsto U(\theta)$ reached in the previous cycle. In (b), we show the repulsion potential $V(\theta,\theta_{k,T}^{(c-1)})$ alongside the vector field $F(\theta,\theta_{k,T}^{(c-1)}) = \nabla_{\theta}V(\theta,\theta_{k,T}^{(c-1)})$. As we show in (c), the repulsion vector field changes the initial vector field, pushing samples away from the sample of the previous cycle.
  • Figure 3: As we show in (a) and (b), we encourage mode exploration by driving particles from the current cycle, $\theta_{k,t}^{(c)}$, away from those of the previous cycle, $\theta_{k,T}^{(c-1)}$.
  • Figure 4: SGMCMC with very high repulsion strength $\xi$. In this case the repulsive force dominates the sampling trajectory.
  • Figure 5: Wasserstein distance matrix after training on EuroSAT, with and without repulsion. The average Wasserstein distance is greater with repulsion.
  • ...and 2 more figures
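
The captions above refer to distances between the representation distributions induced by different prompts. As a rough illustration of the two probability metrics named in the abstract, the sketch below computes a biased estimate of squared MMD with an RBF kernel and the squared 2-Wasserstein distance for equally sized one-dimensional samples. The kernel choice, the `bandwidth` parameter, the restriction to 1D for the Wasserstein case, and the toy representation sets are all assumptions; the form of the repulsion potential $V$ built on these distances is not reproduced here.

```python
import math


def rbf_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two vectors given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(reps_a, reps_b, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two sets of vectors."""
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(reps_a, reps_a) + mean_kernel(reps_b, reps_b)
            - 2.0 * mean_kernel(reps_a, reps_b))


def w2_squared_1d(values_a, values_b):
    """Squared 2-Wasserstein distance between equal-size 1D samples.

    In one dimension the optimal coupling matches sorted values in order.
    """
    if len(values_a) != len(values_b):
        raise ValueError("this sketch assumes equally sized samples")
    pairs = zip(sorted(values_a), sorted(values_b))
    return sum((a - b) ** 2 for a, b in pairs) / len(values_a)


# Hypothetical representation sets produced by three different prompts.
reps_prev = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
reps_near = [[0.1, 0.0], [1.1, 0.0], [0.1, 1.0]]
reps_far = [[3.0, 3.0], [4.0, 3.0], [3.0, 4.0]]

print(mmd_squared(reps_prev, reps_near))  # small: nearly identical representations
print(mmd_squared(reps_prev, reps_far))   # larger: clearly different representations
print(w2_squared_1d([0.0, 1.0, 2.0], [3.0, 1.0, 2.0]))
```

Because both quantities compare sets of representations rather than prompt parameters, two prompts count as different only when they change what the encoders produce, which matches the goal of functionally distinct prompts stated in the TL;DR.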

Theorems & Definitions (5)

  • Remark
  • Remark
  • Remark
  • Remark
  • Remark