GPS: General Per-Sample Prompter

Pawel Batorski; Paul Swoboda

TL;DR

GPS introduces a general-purpose, per-sample prompter trained with reinforcement learning to generate input-specific prompts for unseen tasks without task-specific data. It combines a trainable Prompt Generator, a frozen Evaluator Model, two regularization schemes to prevent leakage, and Minimum Bayes Risk decoding to stabilize inference, achieving competitive results across summarization, simplification, classification, and GSM8K. The work emphasizes zero-shot adaptability and cross-task generalization, showing that per-input prompts can outperform many task-specific prompting methods without large, curated datasets. This paradigm points to practical, on-demand prompting for diverse NLP tasks and motivates further refinements in regularization and sample-efficient training.

Abstract

LLMs are sensitive to prompting, with task performance often hinging on subtle, sometimes imperceptible variations in phrasing. As a result, crafting effective prompts manually remains challenging and time-consuming. Recent automatic prompting methods mitigate this difficulty but face three key limitations: (i) for each new task, they require large datasets to train good prompts; (ii) they rely on costly optimization loops that may take hours; (iii) they typically produce a single task-level prompt that does not adapt to the individual input to be solved. We propose GPS, the first general-purpose, per-sample prompting method. Without any task-specific tuning, GPS generates a tailored prompt for each unseen input, improving performance across diverse tasks. The prompter is trained with reinforcement learning on a suite of training tasks and includes a novel regularization for effectively adapting to per-sample prompting. Finally, we employ Minimum Bayes Risk decoding to stabilize inference. Empirically, GPS demonstrates competitive performance: we attain second-best results among baselines on text simplification, third-best results on summarization, and on-par results on classification, despite not training on any of these tasks, in contrast to the baselines. For in-domain prompting, we obtain state-of-the-art results on GSM8K. Our work shows the potential of a novel and effective paradigm for automatic prompting: generating adaptive, input-specific prompts without extensive optimization and without access to a task-specific training set. Our code is available at https://github.com/Batorskq/GPS.
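To make the Minimum Bayes Risk (MBR) decoding step more concrete, the following is a minimal sketch of generic MBR selection over a set of sampled candidate strings. It is not taken from the GPS code base: the function names `token_f1` and `mbr_select`, the choice of token-level F1 as the utility, and the example candidates are all assumptions made for illustration. The abstract does not specify which utility GPS uses or whether MBR is applied to generated prompts or to final outputs.

```python
from collections import Counter
from typing import Callable, Sequence


def token_f1(a: str, b: str) -> float:
    """Token-level F1 overlap between two strings (an assumed utility)."""
    tokens_a, tokens_b = a.lower().split(), b.lower().split()
    if not tokens_a or not tokens_b:
        # Two empty strings agree perfectly; otherwise there is no overlap.
        return float(tokens_a == tokens_b)
    overlap = sum((Counter(tokens_a) & Counter(tokens_b)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(tokens_a)
    recall = overlap / len(tokens_b)
    return 2 * precision * recall / (precision + recall)


def mbr_select(
    candidates: Sequence[str],
    utility: Callable[[str, str], float] = token_f1,
) -> str:
    """Return the candidate with the highest mean utility against the others.

    Maximizing expected utility under the sample distribution is equivalent
    to minimizing expected risk, which is the MBR decision rule.
    """
    if not candidates:
        raise ValueError("mbr_select needs at least one candidate")
    if len(candidates) == 1:
        return candidates[0]

    best_candidate, best_score = candidates[0], float("-inf")
    for i, candidate in enumerate(candidates):
        # Average agreement with every other sampled candidate.
        others = [c for j, c in enumerate(candidates) if j != i]
        score = sum(utility(candidate, other) for other in others) / len(others)
        if score > best_score:
            best_candidate, best_score = candidate, score
    return best_candidate


# Hypothetical samples: two agree closely, one is an outlier.
samples = [
    "summarize the text in one sentence",
    "summarize the text in one short sentence",
    "translate the text to french",
]
chosen = mbr_select(samples)
```

The selection favors the candidate that agrees most with the rest of the sample set, so a single outlier sample is unlikely to be chosen. This is the sense in which MBR decoding can stabilize inference.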
