Prompt engineering and its implications on the energy consumption of Large Language Models
Riccardo Rubei, Aicha Moussaid, Claudio di Sipio, Davide di Ruscio
TL;DR
This work investigates how prompt engineering, specifically custom tags and explanations, affects the energy consumption of Llama 3 during code completion tasks using CodeXGLUE. By evaluating five prompt configurations across 1,000 snippets with CodeCarbon-based energy measurements, the study shows that tailored prompts can reduce GPU energy use and sometimes improve inference speed without sacrificing accuracy. The best configurations (notably those with explicit explanations) achieve meaningful energy savings and accuracy gains, while configurations lacking system-level prompts can degrade reliability. The findings suggest that prompt design is a practical lever for reducing the carbon footprint of LLM inference in software engineering tasks and motivate broader cross-model and cross-task investigations.
Abstract
Reducing the environmental impact of AI-based software systems has become critical. The intensive use of large language models (LLMs) in software engineering poses severe challenges regarding computational resources, data centers, and carbon emissions. In this paper, we investigate how prompt engineering techniques (PETs) affect the carbon emissions of the Llama 3 model on the code generation task. Using the CodeXGLUE benchmark in an isolated testing environment, we evaluate both the energy consumption and the accuracy of the generated code. Our initial results show that the energy consumption of LLMs can be reduced by using specific tags that distinguish the different parts of a prompt. Although a more in-depth evaluation is needed to confirm our findings, this work suggests that prompt engineering can reduce LLMs' energy consumption during the inference phase without compromising performance, paving the way for further investigations.
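To make the idea of tags that distinguish prompt parts concrete, the following is a minimal sketch of how such a prompt might be assembled. The function name `build_prompt` and the tag names `system`, `explanation`, and `code` are hypothetical, chosen for illustration only; they are not the tags or the implementation used in the study.

```python
from typing import List, Optional, Tuple


def build_prompt(
    code: str,
    system: Optional[str] = None,
    explanation: Optional[str] = None,
    use_tags: bool = True,
) -> str:
    """Assemble a code-completion prompt from its parts.

    Tag names are illustrative assumptions, not those of the paper.
    """
    # Collect the parts that are present, in a fixed order.
    parts: List[Tuple[str, str]] = []
    if system:
        parts.append(("system", system))
    if explanation:
        parts.append(("explanation", explanation))
    parts.append(("code", code))

    if use_tags:
        # Wrap each part in an opening and closing tag so the model
        # can tell the instruction apart from the code to complete.
        return "\n".join(f"<{name}>{text}</{name}>" for name, text in parts)
    # Baseline configuration: the same parts, concatenated without tags.
    return "\n".join(text for _, text in parts)
```

Calling `build_prompt("int x =", system="Complete the code.")` yields a two-line prompt with each part wrapped in its own tag, while `use_tags=False` gives the untagged baseline for comparison.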
