Conversational Prompt Engineering

Liat Ein-Dor, Orith Toledo-Ronen, Artem Spector, Shai Gretz, Lena Dankin, Alon Halfon, Yoav Katz, Noam Slonim

TL;DR

Prompt engineering is essential for tailoring LLM outputs, but it is labor-intensive. Conversational Prompt Engineering (CPE) uses a chat-based interface to elicit a user's task preferences from unlabeled data and to iteratively refine the instruction, ending with a few-shot prompt. A three-party chat design with context management and optional side chats enables data-driven instruction refinement and output enhancement, with a dedicated set of API calls managing the workflow. A user study on summarization tasks shows that the zero-shot prompts generated by CPE can match the performance of their longer few-shot counterparts, indicating substantial efficiency gains for repetitive, high-volume tasks in enterprise settings.

Abstract

Prompts are how humans communicate with LLMs. Informative prompts are essential for guiding LLMs to produce the desired output. However, prompt engineering is often tedious and time-consuming and requires significant expertise, which limits its widespread use. We propose Conversational Prompt Engineering (CPE), a user-friendly tool that helps users create personalized prompts for their specific tasks. CPE uses a chat model to briefly interact with users, helping them articulate their output preferences and integrating these into the prompt. The process includes two main stages. First, the model uses user-provided unlabeled data to generate data-driven questions and uses the user's responses to shape the initial instruction. Then, the model shares the outputs generated by the instruction and uses user feedback to further refine the instruction and the outputs. The final result is a few-shot prompt, where the outputs approved by the user serve as few-shot examples. A user study on summarization tasks demonstrates the value of CPE in creating personalized, high-performing prompts. The results suggest that the zero-shot prompt obtained is comparable to its much longer few-shot counterpart, indicating significant savings in scenarios involving repetitive tasks with large text volumes.
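To make the difference between the two final prompt forms concrete, the following is a minimal Python sketch of how a refined instruction and user-approved outputs could be assembled into a zero-shot or a few-shot prompt. It is an illustration only, not the authors' implementation: the names `ApprovedExample`, `build_few_shot_prompt`, and `build_zero_shot_prompt`, as well as the `Text:` / `Output:` layout, are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ApprovedExample:
    """Hypothetical container: an input text and the output the user approved."""
    text: str
    output: str


def build_few_shot_prompt(instruction: str,
                          examples: list[ApprovedExample],
                          text: str) -> str:
    """Instruction, then each approved (text, output) pair, then the new input.

    The "Text:" / "Output:" layout is an assumption for illustration.
    """
    parts = [instruction]
    for ex in examples:
        parts.append(f"Text: {ex.text}\nOutput: {ex.output}")
    # The new input comes last, with the output slot left open for the model.
    parts.append(f"Text: {text}\nOutput:")
    return "\n\n".join(parts)


def build_zero_shot_prompt(instruction: str, text: str) -> str:
    """The same prompt with no examples: the instruction plus the new input."""
    return build_few_shot_prompt(instruction, [], text)
```

Because the few-shot prompt repeats every approved example in full, its length grows with the number and size of the examples, which is why a comparable zero-shot prompt is cheaper when the same task is run over many long texts.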

Paper Structure

This paper contains 28 sections, 3 figures, and 2 tables.

Figures (3)

  • Figure 1: CPE Workflow from the user's perspective. Each step can be a multi-turn conversation between the user and CPE.
  • Figure 2: Messages exchanged between the different actors in a chat with CPE, from (a) the user approving a suggested prompt until (b) the user approving the output of the first example. System messages are abbreviated (see the full text in the appendix on model instructions). We use suggested_prompt, example_i and prompt_output_i to denote the prompt suggested by CPE, the $i^{th}$ example, and its prompt output, respectively. All the messages below the dotted line are sent with a filtered context.
  • Figure 3: Evaluation results. The frequency of the three generated summaries across the three ranking categories: Best/Middle/Worst.