Auto-Prompt Generation is Not Robust: Prompt Optimization Driven by Pseudo Gradient
Zeru Shi, Zhenting Wang, Yongye Su, Weidi Luo, Hang Gao, Fan Yang, Ruixiang Tang, Yongfeng Zhang
TL;DR
This work identifies a robustness gap in automatic prompt generation for large language models and introduces PertBench, a benchmark spanning nine perturbations across three tasks for systematically evaluating prompt robustness. It then proposes PGO, a gradient-free prompt optimization framework that treats perturbations as pseudo-gradients to guide robust prompt refinement without access to true model gradients. Empirical results across multiple backbones show that PGO consistently improves robustness under noisy inputs, outperforms existing baselines, and transfers across models. Ablation studies confirm the value of the pseudo-gradient component and find that five optimization iterations work best. The work highlights the practical importance of robustness in prompt design and provides a scalable, cost-aware approach for producing stable prompts in real-world, perturbed environments.
Abstract
While automatic prompt generation methods have recently received significant attention, their robustness remains poorly understood. In this paper, we introduce PertBench, a comprehensive benchmark dataset that covers a wide range of input perturbations and is designed to systematically evaluate the robustness of current auto-prompting techniques. Our analysis reveals substantial vulnerabilities in existing prompt generation strategies: even minor modifications to the prompt can lead to significant differences in model output. To address this issue, we propose PGO, a gradient-free prompt generation framework that leverages perturbation types as pseudo-gradient signals to guide LLMs in producing more robust prompts. In contrast to existing methods that assess prompt quality only on clean, well-structured inputs, our approach explicitly emphasizes robustness under noisy and perturbed conditions. Extensive experiments across diverse tasks and multiple LLMs show that PGO consistently outperforms previous methods in maintaining performance under input perturbations.
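
As a rough illustration of the idea of using perturbation types as a feedback signal, the Python sketch below scores a prompt under each perturbation, treats the weakest perturbation type as the "pseudo-gradient", and asks a rewriter to revise the prompt accordingly. It is not the authors' implementation. The two perturbations, the functions `accuracy` and `robust_refine`, the acceptance rule, and the toy `toy_model` and `toy_rewriter` are all hypothetical stand-ins; in a real setup both callables would presumably be LLM calls. The default of five iterations simply mirrors the figure mentioned in the TL;DR.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]            # (input text, expected label)
Model = Callable[[str, str], str]    # (prompt, input text) -> predicted label
Rewriter = Callable[[str, str], str] # (prompt, weakest perturbation) -> new prompt

# Hypothetical, deterministic perturbations (not the PertBench set).
def upper_case(text: str) -> str:
    return text.upper()

def extra_spaces(text: str) -> str:
    return text.replace(" ", "  ")

PERTURBATIONS: Dict[str, Callable[[str], str]] = {
    "case": upper_case,
    "whitespace": extra_spaces,
}

def accuracy(prompt: str, data: List[Example], model: Model,
             perturb: Callable[[str], str] = lambda t: t) -> float:
    """Fraction of examples answered correctly after perturbing the input."""
    hits = sum(model(prompt, perturb(x)) == y for x, y in data)
    return hits / len(data)

def robust_refine(prompt: str, data: List[Example], model: Model,
                  rewriter: Rewriter, iterations: int = 5) -> str:
    """Iteratively revise a prompt, guided by its weakest perturbation type."""
    for _ in range(iterations):
        scores = {name: accuracy(prompt, data, model, fn)
                  for name, fn in PERTURBATIONS.items()}
        worst = min(scores, key=scores.get)   # the "pseudo-gradient" signal
        candidate = rewriter(prompt, worst)
        cand_worst = min(accuracy(candidate, data, model, fn)
                         for fn in PERTURBATIONS.values())
        # Assumed acceptance rule: keep the candidate only if the
        # worst-case accuracy strictly improves.
        if cand_worst > scores[worst]:
            prompt = candidate
    return prompt

# Toy stand-ins so the sketch runs without any LLM.
def toy_model(prompt: str, text: str) -> str:
    # Case-sensitive unless the prompt mentions case handling.
    if "case" in prompt:
        text = text.lower()
    return "pos" if "good" in text else "neg"

def toy_rewriter(prompt: str, worst: str) -> str:
    return f"{prompt} Be robust to {worst} noise."

DATA: List[Example] = [("good movie", "pos"), ("bad movie", "neg")]
BASE_PROMPT = "Classify sentiment."
refined = robust_refine(BASE_PROMPT, DATA, toy_model, toy_rewriter)
```

In this toy setting the base prompt fails on upper-cased inputs, so the "case" perturbation is selected as the signal and the revised prompt is accepted once; later candidates are rejected because they no longer improve the worst case.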
