Conversational Prompt Engineering

Liat Ein-Dor; Orith Toledo-Ronen; Artem Spector; Shai Gretz; Lena Dankin; Alon Halfon; Yoav Katz; Noam Slonim

TL;DR

Prompt engineering is essential for tailoring LLM outputs, but it is labor-intensive. Conversational Prompt Engineering (CPE) uses a chat-based interface to elicit a user's task preferences, grounding its questions in user-provided unlabeled data, and iteratively refines an instruction into a final few-shot prompt. A three-party chat design (user, model, system) with context management and optional side chats supports data-driven instruction refinement and output enhancement, and a dedicated set of API calls manages the workflow. A user study on summarization tasks suggests that the zero-shot prompts produced by CPE perform comparably to their much longer few-shot counterparts, indicating substantial efficiency gains for repetitive, high-volume tasks in enterprise settings.

Abstract

Prompts are how humans communicate with LLMs. Informative prompts are essential for guiding LLMs to produce the desired output. However, prompt engineering is often tedious and time-consuming, and it requires significant expertise, which limits its widespread use. We propose Conversational Prompt Engineering (CPE), a user-friendly tool that helps users create personalized prompts for their specific tasks. CPE uses a chat model to briefly interact with users, helping them articulate their output preferences and integrating these into the prompt. The process includes two main stages. First, the model uses user-provided unlabeled data to generate data-driven questions and uses the user's responses to shape the initial instruction. Then, the model shares the outputs generated by the instruction and uses user feedback to further refine the instruction and the outputs. The final result is a few-shot prompt, where the outputs approved by the user serve as few-shot examples. A user study on summarization tasks demonstrates the value of CPE in creating personalized, high-performing prompts. The results suggest that the zero-shot prompt obtained is comparable to its much longer few-shot counterpart, indicating significant savings in scenarios involving repetitive tasks with large text volumes.
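
To make the abstract's distinction between the zero-shot and few-shot prompts concrete, here is a minimal sketch of how a refined instruction and the user-approved outputs might be assembled into each. This is an illustration under stated assumptions, not the CPE implementation: the `ApprovedExample` class, the `build_prompt` function, and the "Text:" / "Output:" template are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ApprovedExample:
    """Hypothetical container: an input text and the output the user approved."""
    text: str
    output: str


def build_prompt(instruction: str,
                 examples: Sequence[ApprovedExample],
                 new_text: str) -> str:
    """Assemble a prompt from an instruction, optional examples, and a new input.

    With an empty `examples` sequence this yields a zero-shot prompt;
    otherwise each approved (text, output) pair is added as a few-shot example.
    The "Text:" / "Output:" layout is an assumed template, not the paper's.
    """
    parts = [instruction]
    for ex in examples:
        parts.append(f"Text: {ex.text}\nOutput: {ex.output}")
    # The new input ends with an open "Output:" slot for the model to complete.
    parts.append(f"Text: {new_text}\nOutput:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    instruction = "Summarize the text in two sentences for a technical reader."
    approved = [ApprovedExample("First example text.", "Approved summary one.")]
    print(build_prompt(instruction, [], "A new text to summarize."))
    print("---")
    print(build_prompt(instruction, approved, "A new text to summarize."))
```

In this sketch the zero-shot prompt is always shorter than the few-shot one, since the few-shot variant repeats every approved example on each call. That length difference is the source of the savings the abstract refers to for repetitive tasks over large text volumes.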

Paper Structure (28 sections, 3 figures, 2 tables)

Introduction
Related Work
System Overview
Terminology
Overview
UI
Design and Implementation
Three-party Chat
The user:
The model:
The system:
Context Management
Chain of Thought (CoT)
API Calls
submit_message_to_user(msg):
...and 13 more sections

Figures (3)

Figure 1: CPE Workflow from the user's perspective. Each step can be a multi-turn conversation between the user and CPE.
Figure 2: Messages exchanged between the different actors in a chat with CPE, from (a) the user approving a suggested prompt to (b) the user approving the output of the first example. System messages are abbreviated (see full text in Appendix \ref{sec:model_instruction}). We use suggested_prompt, example_i, and prompt_output_i to denote the prompt suggested by CPE, the $i^{th}$ example, and its prompt output, respectively. All messages below the dotted line are sent with a filtered context.
Figure 3: Evaluation results. The frequency of the three generated summaries across the three ranking categories: Best/Middle/Worst.
