Symbolic Prompt Program Search: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization

Tobias Schnabel, Jennifer Neville

TL;DR

Symbolic Prompt Programs (SPPs) enable compile-time optimization of complex, structure-rich prompt programs by representing prompts as graph-structured abstractions. SAMMO introduces a structure-aware search framework with a rich mutation operator set and two search paradigms (enumerative and iterative) to explore both content and structure, outperforming prior methods across diverse LLMs. The framework achieves significant gains in instruction tuning, RAG pipeline tuning, and prompt compression, demonstrating that model- and task-specific prompt optimization is crucial. The work provides open-source tooling to accelerate prompt engineering and suggests future integration with run-time optimization and unsupervised settings.

Abstract

In many modern LLM applications, such as retrieval-augmented generation, prompts have become programs themselves. In these settings, prompt programs are repeatedly called with different user queries or data instances. A major practical challenge is optimizing such prompt programs. Recent work has mostly either focused on simple prompt programs or assumed that the general structure of a prompt program is fixed. We introduce SAMMO, a framework to perform symbolic prompt program search for compile-time optimizations of prompt programs. SAMMO represents prompt programs at a symbolic level, which allows for a rich set of transformations that can be searched over during optimization. We show that SAMMO generalizes previous methods and improves the performance of complex prompts on (1) instruction tuning, (2) RAG pipeline tuning, and (3) prompt compression, across several different LLMs. We make all code available open-source at https://github.com/microsoft/sammo.

Paper Structure (29 sections, 4 equations, 8 figures, 3 tables, 1 algorithm)

Figures (8)

  • Figure 1: Left: Symbolic prompt program (SPP) for a binary classification task, where each node is a function with attributes and dependencies (children). The example also shows how the SPP allows for structural changes (e.g., DeleteNode) and attribute-based changes (e.g., ChangeFormat), which, once applied, yield the mutated prompt (right). These enable SAMMO to explore a large set of possible prompt candidates automatically.
  • Figure 2: SAMMO is a flexible framework for structured prompt optimization and offers two classes of search algorithms, depending on the set of mutators used.
  • Figure 3: SAMMO consistently outperforms all other instruction tuning methods for all of the backend LLMs.
  • Figure 4: SAMMO efficiently improves baseline prompt accuracy across all semantic parsing datasets and backend LLMs with only 24 candidate evaluations.
  • Figure 5: There is only a weak correlation in how well enumerative search candidates perform across LLMs.
  • ...and 3 more figures
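
The idea behind Figure 1 can be illustrated with a small, self-contained sketch. This is not SAMMO's actual API: the `Node` class, the `render`, `delete_node`, `change_format`, and `enumerate_candidates` functions, the `fmt` attribute, and the example prompt text are all hypothetical names chosen for illustration. The sketch assumes a prompt program is a tree whose nodes carry attributes and children, applies one structural mutation and one attribute mutation, and then enumerates every combination of the two, loosely in the spirit of an enumerative search.

```python
from dataclasses import dataclass, replace
from itertools import combinations


@dataclass(frozen=True)
class Node:
    """One node of a toy symbolic prompt program (hypothetical, not SAMMO's class)."""
    name: str
    text: str = ""
    fmt: str = "plain"      # assumed attribute: "plain" or "xml"
    children: tuple = ()


def render(node: Node) -> str:
    """Flatten the tree into the prompt string that would be sent to an LLM."""
    lines = []
    if node.text:
        if node.fmt == "xml":
            lines.append(f"<{node.name}>{node.text}</{node.name}>")
        else:
            lines.append(f"{node.name}: {node.text}")
    for child in node.children:
        lines.append(render(child))
    return "\n".join(line for line in lines if line)


def delete_node(root: Node, name: str) -> Node:
    """Structural mutation: drop every descendant whose name matches."""
    kept = tuple(delete_node(c, name) for c in root.children if c.name != name)
    return replace(root, children=kept)


def change_format(root: Node, fmt: str) -> Node:
    """Attribute mutation: set the format attribute on every node in the tree."""
    children = tuple(change_format(c, fmt) for c in root.children)
    return replace(root, fmt=fmt, children=children)


def enumerate_candidates(root: Node, mutators: list) -> list:
    """Apply every subset of mutators (in list order) to get candidate programs."""
    candidates = []
    for k in range(len(mutators) + 1):
        for combo in combinations(mutators, k):
            candidate = root
            for mutate in combo:
                candidate = mutate(candidate)
            candidates.append(candidate)
    return candidates


# Toy binary classification prompt, written as a tree rather than a flat string.
program = Node("prompt", children=(
    Node("task", "Classify the review as positive or negative."),
    Node("examples", "great film -> positive"),
    Node("input", "{{review}}"),
))

mutators = [
    lambda p: delete_node(p, "examples"),
    lambda p: change_format(p, "xml"),
]

candidates = enumerate_candidates(program, mutators)
mutated = change_format(delete_node(program, "examples"), "xml")
print(render(mutated))
```

In a real optimizer each candidate would be scored on held-out data and the best one kept. That scoring step is omitted here because it depends on the backend LLM and the task metric.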