Beyond Elicitation: Provision-based Prompt Optimization for Knowledge-Intensive Tasks

Yunzhe Xu, Zhuosheng Zhang, Zhe Liu

TL;DR

The paper addresses a limitation of elicitation-based prompt optimization: it cannot supply the domain knowledge that knowledge-intensive tasks require. It introduces KPPO, a knowledge-provision-based framework that combines knowledge-gap filling, batch-wise dual-objective evaluation, and adaptive pruning to place domain knowledge directly in the prompt. On 15 knowledge-intensive benchmarks across finance, law, and medicine, KPPO improves on the strongest baseline by approximately $6\%$ on average, and pruning reduces token usage by up to $29\%$. The results point to a shift from eliciting parametric knowledge to provisioning external knowledge, which helps in domains where factual accuracy and domain-specific terminology are critical.

Abstract

While prompt optimization has emerged as a critical technique for enhancing language model performance, existing approaches primarily focus on elicitation-based strategies that search for optimal prompts to activate models' capabilities. These methods exhibit fundamental limitations when addressing knowledge-intensive tasks, as they operate within fixed parametric boundaries rather than providing the factual knowledge, terminology precision, and reasoning patterns required in specialized domains. To address these limitations, we propose Knowledge-Provision-based Prompt Optimization (KPPO), a framework that reformulates prompt optimization as systematic knowledge integration rather than potential elicitation. KPPO introduces three key innovations: 1) a knowledge gap filling mechanism for knowledge gap identification and targeted remediation; 2) a batch-wise candidate evaluation approach that considers both performance improvement and distributional stability; 3) an adaptive knowledge pruning strategy that balances performance and token efficiency, reducing token usage by up to 29%. Extensive evaluation on 15 knowledge-intensive benchmarks from various domains demonstrates KPPO's superiority over elicitation-based methods, with an average performance improvement of ~6% over the strongest baseline while achieving comparable or lower token consumption. Code at: https://github.com/xyz9911/KPPO.
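
To make the second innovation concrete, the following is a minimal sketch of a dual-objective candidate filter. It is not the authors' implementation. The scoring rule (gain on a recent batch minus a weighted regression on an earlier batch), the `stability_weight` parameter, and the `predict` callable are all assumptions made for illustration; the abstract only states that evaluation considers both performance improvement and distributional stability.

```python
from typing import Callable, Sequence

Example = tuple[str, str]  # (question, gold answer)
Predictor = Callable[[str, str], str]  # (prompt, question) -> predicted answer


def accuracy(prompt: str, examples: Sequence[Example], predict: Predictor) -> float:
    """Fraction of examples answered correctly under the given prompt."""
    if not examples:
        return 0.0
    correct = sum(predict(prompt, question) == answer for question, answer in examples)
    return correct / len(examples)


def select_candidate(
    current: str,
    candidates: Sequence[str],
    recent_batch: Sequence[Example],
    earlier_batch: Sequence[Example],
    predict: Predictor,
    stability_weight: float = 1.0,  # hypothetical trade-off knob, not from the paper
) -> str:
    """Pick the candidate with the best assumed dual-objective score.

    Score = accuracy gain on the recent batch
            - stability_weight * accuracy lost on the earlier batch.
    The current prompt is kept unless some candidate scores above zero.
    """
    base_recent = accuracy(current, recent_batch, predict)
    base_earlier = accuracy(current, earlier_batch, predict)

    best_prompt, best_score = current, 0.0
    for candidate in candidates:
        gain = accuracy(candidate, recent_batch, predict) - base_recent
        regression = max(0.0, base_earlier - accuracy(candidate, earlier_batch, predict))
        score = gain - stability_weight * regression
        if score > best_score:
            best_prompt, best_score = candidate, score
    return best_prompt
```

Under this assumed rule, a candidate that fixes recent mistakes but breaks previously correct cases is penalized, so a smaller but stable improvement can win. Setting `stability_weight` to zero reduces the filter to plain accuracy gain on the recent batch.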

Paper Structure

This paper contains 19 sections, 8 equations, 19 figures, 2 tables, 3 algorithms.

Figures (19)

  • Figure 1: Comparative analysis of prompt optimization performance on 15 knowledge-intensive tasks from various domains. Traditional elicitation-based methods achieve marginal or even negative improvements, while KPPO demonstrates substantial improvements (average +6%) and achieves comparable or better efficiency.
  • Figure 2: Visualization of the failure of elicitation-based prompt optimization to fill an LLM's knowledge gaps in specific domains. Despite improved accuracy on the validation split, the optimized prompt does not provide sufficient domain knowledge, so the original failure cases remain unresolved. The optimized prompts capture surface-level patterns rather than the substantive domain knowledge those cases require.
  • Figure 3: Analysis of key points within the optimized prompt and learning gain across knowledge-intensive tasks.
  • Figure 4: Overview of KPPO. Given task mistakes, the optimizer LLM generates "gradients" by analyzing the original prompt's limitations, producing explanations of failures, identifying knowledge gaps, and suggesting targeted modifications. The framework then integrates these gradients to generate candidate prompts, which undergo an alternative pruning procedure to avoid overly long prompts. Prompt candidates are filtered with a batch-wise dual-objective evaluation that jointly considers performance improvement on recent training instances and distributional stability to ensure robust knowledge integration.
  • Figure 5: Inference token efficiency vs. performance trade-off of adaptive knowledge pruning on 15 tasks.
  • ...and 14 more figures