Efficient Prompting Methods for Large Language Models: A Survey

Kaiyan Chang, Songcheng Xu, Chenglong Wang, Yingfeng Luo, Xiaoqian Liu, Tong Xiao, Jingbo Zhu

TL;DR

This survey defines and systematizes efficient prompting for large language models by separating methods that reduce human effort (automatic prompt engineering) from those that cut computational costs (prompt compression). It introduces a mathematical framing for both prompt design and compression, then details instruction design, CoT optimization, and various editing, sampling, and feedback approaches, followed by continuous and discrete compression techniques including internalization, encoding, pruning, and summarization. The work highlights how these strategies can be combined to achieve comparable or superior performance with fewer resources, and it catalogs open-source tools to aid practitioners. By outlining key challenges and future directions, it aims to guide researchers toward robust, scalable, and interpretable resource-efficient prompting suitable for real-world deployment and progression toward AGI. The contributions include a first comprehensive, math-grounded taxonomy of efficient prompting methods and a curated resource appendix of open-source projects.

Abstract

Prompting is a mainstream paradigm for adapting large language models to specific natural language processing tasks without modifying internal parameters. Because the parameters stay fixed, detailed supplementary knowledge must be supplied through external prompts, which inevitably adds human effort and computational cost in practical applications. Efficient Prompting Methods, which aim to reduce this resource consumption, have therefore attracted wide attention. We provide high-level mathematical expressions to discuss Automatic Prompt Engineering for different prompt components and Prompt Compression in continuous and discrete spaces. Finally, we highlight promising future directions to inspire researchers interested in this field.

Paper Structure

This paper contains 27 sections, 3 equations, 7 figures, 6 tables.

Figures (7)

  • Figure 1: Taxonomy of efficient prompting methods.
  • Figure 2: An overview of efficient prompting methods.
  • Figure 3: The basic pipeline of automatic prompt engineering. Step 1: The discrete prompt space is expanded according to the customized optimization direction. Step 2: Candidate prompts are reasonably evaluated based on target model performance. Step 3: The optimal prompts are selected from the prompt pool using appropriate sampling strategies.
  • Figure 4: Various feedback signals help narrow the optimization space of automatic prompt engineering. The orange dashed line indicates the original prompt search space, and the gray solid line represents the reduced search space that results from a clearer optimization direction.
  • Figure 5: Prompt compression in continuous and discrete space. T2V compression includes Internalizing system or user prompts into model parameters via knowledge distillation (KD), and Encoding key information of hard prompts into soft prompts in an iterative or one-off way. T2T compression covers extractive and abstractive methods: Pruning at various granularities and Summarization for sufficient informativeness, respectively.
  • ...and 2 more figures
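
The three-step pipeline in the Figure 3 caption (expand the prompt space, evaluate candidates on the target model, select the best prompts) can be illustrated with a minimal Python sketch. It is not the survey's implementation. The names `expand`, `evaluate`, `select`, `optimize`, the two edit operators, and the `toy_model` stand-in are all hypothetical, chosen only for illustration. In practice the edits would come from an LLM or a search procedure, and the model would be a real LLM call.

```python
from typing import Callable, List, Tuple

Prompt = str
Example = Tuple[int, int]                 # (input, expected output)
Model = Callable[[Prompt, int], int]      # hypothetical target-model interface
Edit = Callable[[Prompt], Prompt]         # hypothetical prompt edit operator


def expand(pool: List[Prompt], edits: List[Edit]) -> List[Prompt]:
    """Step 1: enlarge the discrete prompt space by applying each edit."""
    candidates = list(pool)
    for prompt in pool:
        for edit in edits:
            candidates.append(edit(prompt))
    # Remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(candidates))


def evaluate(prompt: Prompt, model: Model, dev_set: List[Example]) -> float:
    """Step 2: score a candidate by target-model accuracy on a dev set."""
    correct = sum(model(prompt, x) == y for x, y in dev_set)
    return correct / len(dev_set)


def select(scored: List[Tuple[Prompt, float]], k: int) -> List[Prompt]:
    """Step 3: keep the k best prompts (greedy top-k as the sampling strategy)."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [prompt for prompt, _ in ranked[:k]]


def optimize(seeds: List[Prompt], edits: List[Edit], model: Model,
             dev_set: List[Example], rounds: int = 3, k: int = 2) -> List[Prompt]:
    """Repeat expand -> evaluate -> select for a fixed number of rounds."""
    pool = list(seeds)
    for _ in range(rounds):
        candidates = expand(pool, edits)
        scored = [(p, evaluate(p, model, dev_set)) for p in candidates]
        pool = select(scored, k)
    return pool


# Toy stand-ins (assumptions for the demo, not a real LLM or real edit operators).
def toy_model(prompt: Prompt, x: int) -> int:
    # Pretend the model only doubles correctly when asked to reason step by step.
    return x * 2 if "step by step" in prompt else x + 1


edits: List[Edit] = [
    lambda p: p + " Let's think step by step.",
    lambda p: p + " Please answer concisely.",
]
dev_set: List[Example] = [(1, 2), (2, 4), (3, 6)]

best = optimize(["Double the number."], edits, toy_model, dev_set)
print(best[0])
```

Greedy top-k is only one possible selection rule. The sampling strategies mentioned in the caption could be swapped in by replacing `select` without touching the other two steps.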