Language Model Prompt Selection via Simulation Optimization

Haoting Zhang; Jinghai He; Rhonda Righter; Zeyu Zheng

TL;DR

This paper introduces a two-stage framework that uses simulation optimization to select prompts for pre-trained language models without additional fine-tuning. It first builds a finite, informative soft-prompt set via a text autoencoder and PCA, then uses a Bayesian surrogate with an acquisition function to sequentially evaluate prompts, with a refinement step via Projection Stochastic Kriging to improve latent mappings. The authors demonstrate that a Bayesian neural network surrogate with M-UCB or PR-M-UCB acquisition outperforms direct latent-space search under finite budgets and provide consistency proofs and practical guidance, including hyperparameter tuning through stochastic kriging. The approach is practical for small organizations seeking efficient, model-agnostic prompt optimization and offers broad applicability to other LM-based management tasks. Key contributions include a rigorously framed two-stage framework, consistency guarantees, and a thorough empirical comparison of surrogate models and acquisition strategies. $v(z)$, $h(\tilde{y},y)$, and acquisition terms such as $\text{M-UCB}$ are central to the optimization process, and latent-to-text mappings are enabled by a text autoencoder coupled with PCA.
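The dimension-reduction step mentioned above can be illustrated with a minimal sketch. It assumes the autoencoder latents are already available as rows of a matrix; the random `latents` array, the sizes (200 prompts, 64 latent dimensions, 8 components), and the helper names `pca_reduce` and `pca_reconstruct` are hypothetical and chosen only for illustration. This is plain PCA via SVD, not the authors' implementation.

```python
import numpy as np

def pca_reduce(latents, n_components):
    """Project rows of `latents` onto their top principal components."""
    mean = latents.mean(axis=0)
    centered = latents - mean
    # SVD of the centered data: rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, components, mean

def pca_reconstruct(reduced, components, mean):
    """Map reduced vectors back to the original latent space."""
    return reduced @ components + mean

rng = np.random.default_rng(0)
# Hypothetical stand-in for autoencoder latents: most of the variance
# lies in an 8-dimensional subspace, plus a small amount of noise.
basis = rng.normal(size=(8, 64))
latents = rng.normal(size=(200, 8)) @ basis + 0.01 * rng.normal(size=(200, 64))

soft_prompts, components, mean = pca_reduce(latents, n_components=8)
recovered = pca_reconstruct(soft_prompts, components, mean)
rel_error = np.linalg.norm(recovered - latents) / np.linalg.norm(latents)
```

Each row of `soft_prompts` plays the role of a moderate-dimensional prompt vector, and `pca_reconstruct` gives the vector that a decoder would turn back into text.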

Abstract

With the advancement in generative language models, the selection of prompts has gained significant attention in recent years. A prompt is an instruction or description provided by the user, serving as a guide for the generative language model in content generation. Despite existing methods for prompt selection that are based on human labor, we consider facilitating this selection through simulation optimization, aiming to maximize a pre-defined score for the selected prompt. Specifically, we propose a two-stage framework. In the first stage, we determine a feasible set of prompts in sufficient numbers, where each prompt is represented by a moderate-dimensional vector. In the subsequent stage for evaluation and selection, we construct a surrogate model of the score regarding the moderate-dimensional vectors that represent the prompts. We propose sequentially selecting the prompt for evaluation based on this constructed surrogate model. We prove the consistency of the sequential evaluation procedure in our framework. We also conduct numerical experiments to demonstrate the efficacy of our proposed framework, providing practical instructions for implementation.
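The evaluation and selection stage can be sketched as a loop over a finite set of prompt vectors. The sketch below is an illustration under stated assumptions, not the paper's procedure: it swaps in a Bayesian linear regression surrogate (the paper discusses Gaussian process and Bayesian neural network surrogates) and a plain upper-confidence-bound rule (not the paper's M-UCB). The names `posterior`, `ucb`, `select_prompt`, the parameters `alpha`, `noise_var`, `beta`, and the synthetic scoring function `evaluate` are all hypothetical.

```python
import numpy as np

def posterior(X, y, alpha=1.0, noise_var=0.25):
    """Posterior mean and covariance of weights in Bayesian linear regression."""
    d = X.shape[1]
    precision = alpha * np.eye(d) + X.T @ X / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov

def ucb(Z, mean, cov, beta=2.0):
    """Upper confidence bound of the predicted mean score for each row of Z."""
    mu = Z @ mean
    var = np.einsum("ij,jk,ik->i", Z, cov, Z)
    return mu + beta * np.sqrt(np.maximum(var, 0.0))

def select_prompt(Z, evaluate, budget, n_warmup=5, seed=0):
    """Warm up on random prompts, then evaluate the UCB maximizer each round."""
    rng = np.random.default_rng(seed)
    idx = [int(i) for i in rng.choice(len(Z), size=n_warmup, replace=False)]
    scores = [evaluate(i) for i in idx]
    for _ in range(budget - n_warmup):
        mean, cov = posterior(Z[idx], np.array(scores))
        nxt = int(np.argmax(ucb(Z, mean, cov)))
        idx.append(nxt)
        scores.append(evaluate(nxt))
    mean, _ = posterior(Z[idx], np.array(scores))
    # Report the prompt with the highest posterior mean score.
    return int(np.argmax(Z @ mean)), idx

rng = np.random.default_rng(1)
Z = rng.normal(size=(50, 4))              # hypothetical soft-prompt vectors
w_true = np.array([1.0, -0.5, 0.3, 0.8])  # hypothetical; unknown in practice

def evaluate(i):
    """Synthetic noisy score standing in for a language-model evaluation."""
    return float(Z[i] @ w_true + 0.5 * rng.normal())

best, history = select_prompt(Z, evaluate, budget=30)
```

In a real setting `evaluate` would query the language model with the decoded prompt and score its output, so each call consumes part of the evaluation budget.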

Paper Structure (35 sections, 6 theorems, 78 equations, 8 figures, 1 table, 4 algorithms)

Introduction
Introduction to Our Method and Results
Literature Review
Problem Description
Search Stage
Prompt Vector Representation
Soft Prompt Set Construction
Evaluation and Selection Stage
Warm-up Step
Sequential Evaluation Step
Bayesian Parametric Surrogate Model
Acquisition Function & Optimization
Refinement
Experiments
Surrogate Model Comparison
...and 20 more sections

Key Result

Theorem 1

Let $z^*\in\arg\max_{z_n\in\mathcal{Z}}v\left(z_n\right)$ be the prompt with the highest mean score and $\widehat{z}^*$ be the soft prompt selected by Algorithm 1. Under Assumption 1,

Figures (8)

Figure 1: An illustration of different prompts leading to different outputs on a common subject.
Figure 2: Our framework of prompt selection.
Figure 3: Experimental results for approximating the mean score with the soft prompts using different Bayesian parametric models. The task is word sorting and the generative language model is text-davinci-003.
Figure 4: Experimental results for approximating the mean score with the soft prompts using different Bayesian parametric models. The task is word sorting and the generative language model is gpt-3.5-turbo.
Figure 5: Experimental results for comparison between two acquisition functions, M-UCB and PR-M-UCB (number of starting points, number of gradient ascent iterations). The task is finding the largest animal given a list of names, and the generative language model is gpt-3.5-turbo.
...and 3 more figures

Theorems & Definitions (10)

Example 1: Gaussian Process
Example 2: Bayesian Neural Network
Theorem 1
Proposition 1
Theorem 2
Proposition 2
Proposition 3
Theorem 3
Definition 1: Token
Definition 2: Hamiltonian
