Minstrel: Structural Prompt Generation with Multi-Agents Coordination for Non-AI Experts

Ming Wang, Yuanzhong Liu, Xiaoyu Liang, Yijie Huang, Daling Wang, Xiaocui Yang, Sijia Shen, Shi Feng, Xiaoming Zhang, Chaofeng Guan, Yifei Zhang

TL;DR

This work introduces LangGPT, a structural prompt framework inspired by programming languages, whose dual-layer design of modules and elements improves the generalization and reuse of prompts for non-AI experts. It further presents Minstrel, a multi-agent system with reflection that coordinates three working groups (analysis, design, and test) to automatically generate and refine LangGPT prompts. Empirical results show that structural prompts, whether generated by Minstrel or written manually, outperform baselines, with Minstrel prompts approaching or surpassing human-written prompts across a range of benchmarks and LLMs; a user study supports high ease of use. However, gains are smaller for lower-capacity LLMs, motivating future work to optimize prompts for weaker models and broaden usability.

Abstract

LLMs have demonstrated commendable performance across diverse domains. Nevertheless, formulating high-quality prompts to assist them in their work poses a challenge for non-AI experts. Existing research in prompt engineering offers somewhat scattered optimization principles and designs empirically dependent prompt optimizers. Unfortunately, these efforts lack a structural design, which incurs high learning costs and is not conducive to the iterative updating of prompts, especially for non-AI experts. Inspired by structured, reusable programming languages, we propose LangGPT, a structural prompt design framework. Furthermore, we introduce Minstrel, a multi-generative agent system with reflection to automate the generation of structural prompts. Experiments and a case study illustrate that structural prompts generated by Minstrel or written manually significantly enhance the performance of LLMs. Furthermore, we analyze the ease of use of structural prompts through a user survey in our online community.

Paper Structure (24 sections, 2 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Complaints voiced by Oliver when using LLMs. The name 'Oliver' is fictional, and 'Johnny' comes from 10.1145/3544548.3581388.
  • Figure 2: The overall framework of Minstrel, a structural prompt generation framework with multi-agent collaboration. There are three working groups: the analysis group, the design group, and the test group. In the design group, blue modules are activated, and green modules are not required for the current task and are not activated.
  • Figure 3: Results of different scales of Qwen. Each subfigure represents a different task, whereas the first subfigure represents the overall performance. 'Instruction' indicates that LangGPT prompts are not used while 'LangGPT' indicates that they are used.
  • Figure 4: A case of a flatterer. The responses of ChatGPT-3.5 to the user under three different prompts. Mingyuan University doesn't really exist.
  • Figure 5: The homepage of the community documents we constructed based on LangGPT and Minstrel.
  • ...and 1 more figure