Bayesian Optimization in Language Space: An Eval-Efficient AI Self-Improvement Framework
Enoch Hyunwook Kang, Hema Yoganarasimhan
TL;DR
This work addresses the evaluation-efficiency bottleneck in self-improving AI by reframing prompt optimization as Bayesian Optimization in language space. It introduces TextGrad-Best-of-N Bayesian Optimization (T-BoN BO), which combines textual gradients (TextGrad) with Best-of-N gradient selection to mimic the upper confidence bound (UCB) acquisition function in language space without an explicit surrogate model. The authors prove that the Best-of-N gradient asymptotically aligns with the UCB gradient, yielding evaluation-efficient search, and validate the approach empirically on ad-optimization tasks using LLM-based persona simulations, where T-BoN BO outperforms state-of-the-art baselines such as Best-of-N and GEPA. The results suggest that evaluation-efficient self-improvement can be achieved in practice, enabling faster and more robust alignment of AI-generated content with target user preferences across diverse scenarios, even with limited contextual information.
Abstract
Large Language Models (LLMs) have recently enabled self-improving AI, i.e., AI that iteratively generates, evaluates, and refines its own outcomes. Recent studies have shown that self-improving AI focusing on prompt optimization can outperform state-of-the-art reinforcement-learning fine-tuned LLMs. In these studies, "performance" is typically measured by query efficiency, i.e., the number of LLM-generated solution samples required to meet a certain performance threshold. However, in many societal applications, the primary limitation is not generating new solutions but evaluating them. For instance, evaluating an ad's effectiveness requires significant human feedback, which is far more costly and time-consuming than generating a candidate ad. To optimize for evaluation efficiency, a natural approach is to extend Bayesian Optimization (BO), a framework proven optimal for evaluation efficiency, to the language domain. However, the difficulty of directly estimating suitable acquisition functions in LLMs' minds makes this extension challenging. This paper overcomes this challenge by proving that combining the simple and widely used Best-of-N selection strategy with simple textual gradients (i.e., textual edits from a critic model) statistically emulates the behavior of gradients of the canonical upper confidence bound (UCB) acquisition function, which induces optimal exploration in terms of evaluation efficiency. Based on this result, we propose TextGrad-Best-of-N Bayesian Optimization (T-BoN BO), a simple and eval-efficient language-space Bayesian optimization framework for AI self-improvement. We also empirically validate T-BoN BO by applying it to automated ad alignment tasks for a persona distribution, demonstrating its superior performance compared to popular state-of-the-art baselines.
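
As a rough numerical intuition for why Best-of-N selection can behave like a UCB rule, consider the standard form UCB(x) = mu(x) + beta * sigma(x). The sketch below is not the paper's proof or algorithm. It assumes, purely for illustration, that the predicted score of each candidate edit is an independent Gaussian draw with mean `mu` and standard deviation `sigma`; the function names and parameters are hypothetical. Under that assumption, the expected best of N draws equals `mu` plus a multiple of `sigma`, and the multiplier grows with N, so N plays the role of the exploration weight beta.

```python
import random
import statistics


def best_of_n_score(mu, sigma, n, rng):
    """Highest of n scores drawn from N(mu, sigma^2) (assumed toy score model)."""
    return max(rng.gauss(mu, sigma) for _ in range(n))


def mean_best_of_n(mu, sigma, n, trials=20000, seed=0):
    """Monte Carlo estimate of the expected Best-of-N score."""
    rng = random.Random(seed)
    return statistics.fmean(
        best_of_n_score(mu, sigma, n, rng) for _ in range(trials)
    )


def implied_beta(n, trials=20000, seed=0):
    """Expected maximum of n standard normal draws.

    In this toy model it acts as the UCB exploration weight:
    E[best of n] = mu + implied_beta(n) * sigma.
    """
    return mean_best_of_n(0.0, 1.0, n, trials, seed)


if __name__ == "__main__":
    mu, sigma = 0.5, 0.2  # hypothetical mean and uncertainty of a candidate's score
    for n in (1, 2, 4, 8, 16):
        beta = implied_beta(n)
        print(f"N={n:2d}  implied beta={beta:.3f}  "
              f"E[best-of-N]={mean_best_of_n(mu, sigma, n):.3f}  "
              f"mu+beta*sigma={mu + beta * sigma:.3f}")
```

In this toy setting, increasing N shifts the selection toward candidates whose scores are more uncertain, which is the same qualitative effect as raising beta in UCB. The paper's result concerns gradients in language space and does not rely on this Gaussian assumption.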
