Efficient Methods for Natural Language Processing: A Survey
Marcos Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Colin Raffel, Pedro H. Martins, André F. T. Martins, Jessica Zosa Forde, Peter Milder, Edwin Simpson, Noam Slonim, Jesse Dodge, Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz
TL;DR
This survey addresses the resource and energy bottlenecks of modern NLP by cataloging efficiency approaches across data usage, model design, pre-training, fine-tuning, inference, and hardware. It highlights methods such as data deduplication, active and curriculum learning, sparse modeling, retrieval-augmented architectures, adapters and LoRA for parameter efficiency, pruning, distillation, and quantization, as well as hardware-aware co-design and edge deployment strategies. The work emphasizes that efficiency is multi-faceted and should be evaluated along Pareto fronts with metrics like FLOP/s, power, and carbon emissions; it calls for standardized reporting and cross-stage analysis so that methods can be compared meaningfully. Overall, the paper outlines scalable directions for reducing NLP compute while achieving similar performance and stresses the importance of algorithm-hardware co-design for real-world impact.
Abstract
Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource consumption also grows. Such resources include data, time, storage, and energy, all of which are naturally limited and unevenly distributed. This motivates research into efficient methods that require fewer resources to achieve similar results. This survey synthesizes and relates current methods and findings in efficient NLP. We aim both to provide guidance for conducting NLP under limited resources and to point towards promising research directions for developing more efficient methods.
