LLM Based Bayesian Optimization for Prompt Search
Adam Ballew, Jingbo Wang, Shaogang Ren
TL;DR
LLM-based text classification is sensitive to prompt wording, which motivates automated, data-efficient prompt optimization. The authors propose BO-LLM, a Bayesian Optimization framework that uses an LLM-powered Gaussian Process surrogate and an Upper Confidence Bound (UCB) acquisition function to search a discrete prompt space, with an LLM-based expansion module generating the candidate prompts. They formalize the data representation, surrogate model, and acquisition function, and report results competitive with ProTeGi on the LIAR and ETHOS datasets, including an extension to multi-turn clarification QA. The work points to principled, sample-efficient prompt engineering as a route to scalable, automated prompt optimization, while acknowledging open challenges: evaluation noise, label-reversal issues, and the trade-off between stability and cost.
Abstract
Bayesian Optimization (BO) is widely used to optimize expensive black-box functions with a limited number of evaluations. In this paper, we investigate the use of BO for prompt engineering to improve text classification with Large Language Models (LLMs). We employ an LLM-powered Gaussian Process (GP) as the surrogate model to estimate the performance of different prompt candidates. These candidates are generated by an LLM through the expansion of a set of seed prompts and are then scored by an Upper Confidence Bound (UCB) acquisition function computed from the GP posterior. The optimization process iteratively refines the prompts using a subset of the data, aiming to improve classification accuracy while reducing the number of API calls by exploiting the predictive uncertainty of the LLM-based GP. We evaluate the proposed BO-LLM algorithm on two datasets and discuss its advantages in detail.
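The loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each candidate prompt has already been mapped to a fixed-length feature vector (a stand-in for the LLM-powered representation), uses a plain RBF-kernel GP in NumPy instead of an LLM-based GP, and replaces the real classification-accuracy evaluation with a hypothetical toy function. The names `rbf_kernel`, `gp_posterior`, `ucb_select`, `bo_prompt_search`, and `toy_accuracy` are illustrative only.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    # Squared-exponential kernel between the rows of A and the rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_posterior(X_obs, y_obs, X_cand, noise=1e-3):
    # Standard GP regression posterior (mean and std) at the candidate points.
    # The observed scores are centered so the prior mean is their average.
    y_mean = y_obs.mean()
    K = rbf_kernel(X_obs, X_obs) + noise * np.eye(len(X_obs))
    K_s = rbf_kernel(X_obs, X_cand)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs - y_mean))
    mu = K_s.T @ alpha + y_mean
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance is 1 for the RBF kernel
    return mu, np.sqrt(np.maximum(var, 0.0))

def ucb_select(mu, sigma, evaluated, beta=2.0):
    # UCB acquisition: posterior mean plus beta times posterior std.
    score = mu + beta * sigma
    score[list(evaluated)] = -np.inf  # never re-evaluate a prompt
    return int(np.argmax(score))

def bo_prompt_search(features, evaluate, seed_idx, n_iter=10, beta=2.0):
    # features: (n_prompts, d) array, one row per candidate prompt (assumed given).
    # evaluate: callable index -> score, standing in for accuracy on a data subset.
    evaluated = {i: evaluate(i) for i in seed_idx}
    for _ in range(n_iter):
        if len(evaluated) == len(features):
            break
        idx = sorted(evaluated)
        y_obs = np.array([evaluated[i] for i in idx])
        mu, sigma = gp_posterior(features[idx], y_obs, features)
        nxt = ucb_select(mu, sigma, evaluated, beta)
        evaluated[nxt] = evaluate(nxt)  # the one expensive call per iteration
    best = max(evaluated, key=evaluated.get)
    return best, evaluated

# Toy usage: random vectors play the role of prompt features, and the
# hypothetical score peaks at candidate 7.
rng = np.random.default_rng(0)
features = rng.normal(size=(30, 4))

def toy_accuracy(i):
    return float(np.exp(-np.sum((features[i] - features[7]) ** 2) / 4.0))

best, history = bo_prompt_search(features, toy_accuracy, seed_idx=[0, 1, 2], n_iter=8)
print("best candidate:", best, "score:", round(history[best], 3))
```

In this sketch each iteration spends exactly one evaluation, on the candidate whose upper confidence bound is highest, which is how the GP's predictive uncertainty is meant to save API calls. The candidate pool is fixed here; in the framework described above, new candidates would be produced by an LLM expanding the seed prompts.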
