RAPGen: An Approach for Fixing Code Inefficiencies in Zero-Shot

Spandan Garg; Roshanak Zilouchian Moghaddam; Neel Sundaresan

TL;DR

The paper tackles performance-bug repair with RAPGen, a Retrieval-Augmented Prompt Generation framework that uses a knowledge-base of past performance fixes to build targeted prompts for zero-shot repair with LLMs. RAPGen constructs a Code Transformation Knowledge-base from GitHub PERF commits, retrieves the instruction that fits the buggy code, and prompts an LLM to produce a fix, which avoids expensive fine-tuning. On the DeepDev-PERF dataset, RAPGen produces a correct suggestion in about 60% of cases and reproduces the developer's fix verbatim in roughly 42%. It performs more strongly than the prompt variants and prior methods it is compared against in both automated and human evaluations, and an in-the-wild evaluation adds evidence from real production codebases. The work shows that knowledge-base–guided prompt engineering is a viable, low-cost alternative to fine-tuning for code repair, and it points toward extending the approach to other languages and bug types.

Abstract

Performance bugs are non-functional bugs that can manifest even in well-tested commercial products. Fixing these performance bugs is an important yet challenging problem. In this work, we address this challenge and present a new approach called Retrieval-Augmented Prompt Generation (RAPGen). Given a code snippet with a performance issue, RAPGen first retrieves a prompt instruction from a pre-constructed knowledge-base of previous performance bug fixes and then generates a prompt using the retrieved instruction. It then uses this prompt on a Large Language Model (such as Codex) in a zero-shot setting to generate a fix. We compare our approach with various prompt variations and state-of-the-art methods on the task of performance bug fixing. Our evaluation shows that RAPGen can generate performance improvement suggestions equivalent to or better than a developer's in ~60% of the cases, getting ~42% of them verbatim, on an expert-verified dataset of past performance changes made by C# developers.

Paper Structure (21 sections, 10 figures, 2 tables, 1 algorithm)

Introduction
Motivating Example
Our Approach
Code Transformation Knowledge-base
Generating the Prompt
Generating Fixes
Empirical Evaluation
Experimental Setup
Comparing with Other Prompt Variants
Comparison with State-of-the-art
In-The-Wild Evaluation
Experimental Setup
Finding real world problems to fix
Reaching out to developers
Results
...and 6 more sections

Figures (10)

Figure 1: A C# code snippet with an expensive LINQ query (highlighted in red) from a performance bug fix commit on GitHub. LINQ tends to get misused by developers and can often lead to performance issues, such as the one above. In this case, the LINQ methods Where and FirstOrDefault are used to iterate over a collection to find all entries matching a predicate, when only the first match is needed and the search could potentially stop early. Depending on the size of the collection and how frequently this code gets invoked, this may become a performance hot-spot in an application. The screenshot below shows the flame graph for this application, with the relevant section highlighted. The call stack shows the FirstOrDefault call (its frame highlighted in yellow) as the most expensive line within the Undo method.
Figure 2: Model input prompt used to generate a fix for the method in Figure 1. The prompt consists of (i) the commented original buggy method, (ii) an instruction telling the model how to fix the issue, and (iii) the starting fragment of the buggy method.
Figure 3: Suggestion generated by the LLM when asked to complete the prompt in Figure 2. The suggested fix is to unroll the LINQ query in favour of a foreach loop, which can stop early when a matching entry is found. This fix closely matches the developer fix and saves other potential LINQ overheads such as the allocations and GC.
Figure 4: The fix generation pipeline followed by RAPGen, showing how it is used at inference time to fix a performance issue given a buggy method and corresponding expensive line.
Figure 5: High-level prompt template followed by the prompts that RAPGen generates. Each prompt consists of the buggy method itself, then the prompt instruction retrieved from the KB using the buggy line within the buggy method, and finally the signature of the method, followed by an open curly brace.
...and 5 more figures
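
The prompt layout described in the captions for Figures 2 and 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the knowledge-base entries, the names `KnowledgeBaseEntry`, `retrieve_instruction`, and `build_prompt`, and the keyword-overlap lookup are all hypothetical stand-ins, since this summary does not specify how RAPGen actually indexes or retrieves instructions.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class KnowledgeBaseEntry:
    """Hypothetical KB record: API names that trigger an instruction."""
    api_names: frozenset
    instruction: str


# Illustrative entries only; not taken from the paper's knowledge-base.
KNOWLEDGE_BASE = [
    KnowledgeBaseEntry(
        frozenset({"Where", "FirstOrDefault"}),
        "Replace the LINQ query with a foreach loop that returns "
        "as soon as a match is found",
    ),
    KnowledgeBaseEntry(
        frozenset({"Count"}),
        "Use Any() instead of comparing Count() with zero",
    ),
]


def retrieve_instruction(buggy_line: str,
                         kb: List[KnowledgeBaseEntry]) -> Optional[str]:
    """Return the instruction whose API names overlap most with the line.

    Assumption: simple identifier overlap stands in for the real retrieval.
    """
    tokens = set(re.findall(r"[A-Za-z_]\w*", buggy_line))
    best = max(kb, key=lambda entry: len(entry.api_names & tokens))
    return best.instruction if best.api_names & tokens else None


def build_prompt(buggy_method: str, signature: str, instruction: str) -> str:
    """Assemble: commented buggy method, instruction, signature, open brace."""
    commented = "\n".join("// " + line for line in buggy_method.splitlines())
    return f"{commented}\n// {instruction}\n{signature}\n{{"
```

Given a buggy C# method and its expensive line, `retrieve_instruction` picks an instruction and `build_prompt` returns text that ends with the method signature and an open curly brace, so that a code-completion model would continue by writing the body of the fixed method.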
