LangGPT: Rethinking Structured Reusable Prompt Design Framework for LLMs from the Programming Language

Ming Wang, Yuanzhong Liu, Xiaoyu Liang, Songlian Li, Yijie Huang, Xiaoming Zhang, Sijia Shen, Chaofeng Guan, Daling Wang, Shi Feng, Huaiwen Zhang, Yifei Zhang, Minghui Zheng, Chi Zhang

TL;DR

Prompt design for LLMs remains fragmented and difficult for non-AI experts. LangGPT introduces a dual-layer, programming-language-inspired framework to improve generalization, reusability, and iterative updating. The framework structures prompts as modules and internal elements, distinguishes inherent modules from extension modules, and provides Markdown and JSON representations for reuse. Empirical results show that LangGPT improves task performance across multiple LLMs and benchmarks. A user survey indicates that the framework is easy to use, and a case study illustrates richer, more nuanced responses. Limitations include reduced gains for low-performance models; future work covers tool integration and optimization.
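The dual-layer idea (modules that contain internal elements, rendered as Markdown or JSON) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the class names `Module` and `Prompt`, the module names "Profile" and "Constraints", and the exact Markdown and JSON layouts are assumptions made for this example.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Module:
    """Outer layer: a named section of the prompt (names are hypothetical)."""
    name: str
    # Inner layer: the individual instructions that belong to this module.
    elements: list[str] = field(default_factory=list)


@dataclass
class Prompt:
    """A prompt assembled from reusable modules."""
    title: str
    modules: list[Module] = field(default_factory=list)

    def to_markdown(self) -> str:
        # One heading per module, one bullet per element.
        lines = [f"# {self.title}"]
        for module in self.modules:
            lines.append("")
            lines.append(f"## {module.name}")
            lines.extend(f"- {element}" for element in module.elements)
        return "\n".join(lines)

    def to_json(self) -> str:
        # Same content as a nested mapping: module name -> list of elements.
        data = {
            "title": self.title,
            "modules": {m.name: m.elements for m in self.modules},
        }
        return json.dumps(data, indent=2)


# Hypothetical example prompt; the wording is invented for illustration.
prompt = Prompt(
    title="Travel Guide",
    modules=[
        Module("Profile", ["You are a travel guide for first-time visitors."]),
        Module("Constraints", ["Do not invent places that you cannot verify."]),
    ],
)

print(prompt.to_markdown())
print(prompt.to_json())
```

Because each module is a separate object, a module such as the hypothetical "Constraints" one could be copied into another prompt or revised on its own, which is the kind of reuse and iterative updating the summary describes.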

Abstract

LLMs have demonstrated commendable performance across diverse domains. Nevertheless, formulating high-quality prompts to instruct LLMs proficiently poses a challenge for non-AI experts. Existing research in prompt engineering suggests somewhat scattered optimization principles and designs empirically dependent prompt optimizers. Unfortunately, these endeavors lack a structured design template, incurring high learning costs and resulting in low reusability. In addition, they are not conducive to the iterative updating of prompts. Inspired by structured reusable programming languages, we propose LangGPT, a dual-layer prompt design framework as the programming language for LLMs. LangGPT has an easy-to-learn normative structure and provides an extended structure for migration and reuse. Experiments illustrate that LangGPT significantly enhances the performance of LLMs. Moreover, the case study shows that LangGPT leads LLMs to generate higher-quality responses. Furthermore, we analyzed the ease of use and reusability of LangGPT through a user survey in our online community.

Paper Structure (17 sections, 4 figures, 6 tables)

Figures (4)

  • Figure 1: Analogy between programming language and natural language prompt. The two types of languages are compared in terms of their hierarchical structure. Circles of different sizes indicate different layers. Smaller circles lie closer to the inner layers and are drawn in darker colors.
  • Figure 2: A case of a flatterer. The responses of ChatGPT-3.5 to the user under three different prompts. Mingyuan University doesn't really exist.
  • Figure 3: Results of different scales of Qwen. The first subfigure shows the overall performance, and each remaining subfigure shows a different task. 'Instruction' indicates that LangGPT prompts are not used, while 'LangGPT' indicates that they are used.
  • Figure 4: Ratings on ease of use in user survey. The lowest score is 0, which means very difficult to use, and the highest score is 5, which means very easy to use. The ":" is used to separate scores and percentages.