Efficient Prompting Methods for Large Language Models: A Survey
Kaiyan Chang, Songcheng Xu, Chenglong Wang, Yingfeng Luo, Xiaoqian Liu, Tong Xiao, Jingbo Zhu
TL;DR
This survey defines and systematizes efficient prompting for large language models by separating methods that reduce human effort (automatic prompt engineering) from those that cut computational costs (prompt compression). It introduces a mathematical framing for both prompt design and compression, then details instruction design, CoT optimization, and various editing, sampling, and feedback approaches, followed by continuous and discrete compression techniques including internalization, encoding, pruning, and summarization. The work highlights how these strategies can be combined to achieve comparable or superior performance with fewer resources, and it catalogs open-source tools to aid practitioners. By outlining key challenges and future directions, it aims to guide researchers toward robust, scalable, and interpretable resource-efficient prompting suitable for real-world deployment and progression toward AGI. The contributions include a first comprehensive, math-grounded taxonomy of efficient prompting methods and a curated resource appendix of open-source projects.
Abstract
Prompting is a mainstream paradigm for adapting large language models to specific natural language processing tasks without modifying their internal parameters. Because the model itself is left unchanged, detailed supplementary knowledge must be supplied through external prompts, which inevitably adds human effort and computational burden in practical applications. As an effective way to reduce this resource consumption, Efficient Prompting Methods have attracted wide attention. We provide high-level mathematical expressions to discuss in depth Automatic Prompt Engineering for different prompt components and Prompt Compression in continuous and discrete spaces. Finally, we highlight promising future directions to inspire researchers interested in this field.
