Language Models in Software Development Tasks: An Experimental Analysis of Energy and Accuracy

Negar Alizadeh; Boris Belchev; Nishant Saurabh; Patricia Kelbert; Fernando Castor

TL;DR

The findings reveal that employing a larger LLM with a higher energy budget does not always translate to significantly improved accuracy, and that quantized versions of large models generally offer better efficiency and accuracy than full-precision versions of medium-sized ones.

Abstract

The use of generative AI-based coding assistants like ChatGPT and GitHub Copilot is a reality in contemporary software development. Many of these tools are provided as remote APIs. Using third-party APIs raises data privacy and security concerns for client companies, which motivates the use of locally deployed language models. In this study, we explore the trade-off between model accuracy and energy consumption, aiming to help developers make informed decisions when selecting a language model. We investigate the performance of 18 families of LLMs in typical software development tasks on two real-world infrastructures: a commodity GPU and a powerful AI-specific GPU. Because deploying LLMs locally requires powerful infrastructure that might not be affordable for everyone, we consider both full-precision and quantized models. Our findings reveal that employing a larger LLM with a higher energy budget does not always translate to significantly improved accuracy. Additionally, quantized versions of large models generally offer better efficiency and accuracy than full-precision versions of medium-sized ones. Finally, no single model is suitable for all types of software development tasks.
