SIPDO: Closed-Loop Prompt Optimization via Synthetic Data Feedback
Yaoning Yu, Ye Yu, Peiyan Zhang, Kai Wei, Haojing Luo, Haohan Wang
TL;DR
SIPDO introduces a closed-loop, data-centric approach to prompt optimization that jointly trains a Synthetic Data Generator and an Auto Prompt Optimizer to identify and fix prompt weaknesses through progressively harder synthetic examples. The framework reframes prompt tuning as an adaptive curriculum that stress-tests prompts and guides iterative rewrites, supported by a theoretical guarantee on worst-case error under regularised data generation. Empirically, SIPDO yields robust improvements across diverse reasoning benchmarks (e.g., BIG-Bench, FOLIO, PrOntoQA, ProofWriter, MMLU), outperforming several leading baselines and remaining competitive on challenging tasks such as ProofWriter. The work demonstrates the practical value of integrating synthetic data generation with prompt refinement to enhance the robustness and reliability of LLM-driven systems, with potential for domain-specific extensions and fully automated continual learning.
Abstract
Prompt quality plays a critical role in the performance of large language models (LLMs), motivating a growing body of work on prompt optimization. Most existing methods optimize prompts over a fixed dataset, assuming static input distributions and offering limited support for iterative improvement. We introduce SIPDO (Self-Improving Prompts through Data-Augmented Optimization), a closed-loop framework for prompt learning that integrates synthetic data generation into the optimization process. SIPDO couples a synthetic data generator with a prompt optimizer, where the generator produces new examples that reveal current prompt weaknesses and the optimizer incrementally refines the prompt in response. This feedback-driven loop enables systematic improvement of prompt performance without assuming access to external supervision or new tasks. Experiments across question answering and reasoning benchmarks show that SIPDO outperforms standard prompt tuning methods, highlighting the value of integrating data synthesis into prompt learning workflows.
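The feedback loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`closed_loop_optimize`, `toy_generate`, `toy_solves`, `toy_revise`), the example format, and the integer difficulty schedule are all hypothetical, and the LLM-backed generator, evaluator, and optimizer are replaced by deterministic toy stand-ins so the control flow can be run without any model.

```python
from typing import Callable, List, Tuple

# Hypothetical example format: (question text, tag of the skill it requires).
Example = Tuple[str, str]


def closed_loop_optimize(
    prompt: str,
    generate: Callable[[str, int], List[Example]],
    solves: Callable[[str, Example], bool],
    revise: Callable[[str, List[Example]], str],
    rounds: int = 3,
) -> Tuple[str, List[Example]]:
    """Alternate between synthesizing harder examples and revising the prompt."""
    seen: List[Example] = []
    for difficulty in range(1, rounds + 1):
        # Generator step: produce new examples at the current difficulty.
        batch = generate(prompt, difficulty)
        seen.extend(batch)
        # Feedback step: collect the examples the current prompt fails on.
        failures = [ex for ex in batch if not solves(prompt, ex)]
        # Optimizer step: revise the prompt only when weaknesses were found.
        if failures:
            prompt = revise(prompt, failures)
    return prompt, seen


# --- Toy stand-ins (assumptions, in place of real LLM calls) ---

def toy_generate(prompt: str, difficulty: int) -> List[Example]:
    # Each difficulty level requires one new skill tag.
    tag = f"skill-{difficulty}"
    return [(f"question {i} needing {tag}", tag) for i in range(2)]


def toy_solves(prompt: str, example: Example) -> bool:
    # The toy "model" succeeds only if the prompt mentions the required skill.
    return example[1] in prompt


def toy_revise(prompt: str, failures: List[Example]) -> str:
    # Append one hint per distinct skill that caused a failure.
    missing = sorted({tag for _, tag in failures})
    return prompt + "".join(f"\nHint: handle {tag}." for tag in missing)


final_prompt, history = closed_loop_optimize(
    "Answer the question.", toy_generate, toy_solves, toy_revise
)
```

In this toy setting every round exposes one missing skill, so the prompt gains one hint per round. In a real system each of the three callables would wrap a model call, and a revision step would likely also re-check earlier examples to guard against regressions; that check is omitted here for brevity.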
