Code Roulette: How Prompt Variability Affects LLM Code Generation

Andrei Paleyes; Radzim Sendyka; Diana Robinson; Christian Cabrera; Neil D. Lawrence

Abstract

Code generation is one of the most active application areas of Large Language Models (LLMs). While LLMs lower the barriers to writing code and accelerate the development process, the overall quality of generated programs depends on the quality of the given prompts. Specifically, the functionality and quality of generated code can be sensitive to the user's background and familiarity with software development. It is therefore important to quantify LLMs' sensitivity to variations in the input. To this end, we propose an evaluation pipeline for LLM code generation with a focus on measuring sensitivity to prompt augmentations. The pipeline is completely agnostic to specific programming tasks and LLMs, and thus widely applicable. We provide extensive experimental evidence illustrating the utility of our method and share our code for the benefit of the community.
