The Right Prompts for the Job: Repair Code-Review Defects with Large Language Model

Zelin Zhao; Zhaogui Xu; Jialong Zhu; Peng Di; Yuan Yao; Xiaoxing Ma

TL;DR

This paper investigates how to prompt Large Language Models to repair code-review defects using review comments. The best prompt achieves a repair rate of 72.97%, a substantial improvement in the effectiveness and practicality of automatic repair techniques.

Abstract

Automatic program repair (APR) techniques have the potential to reduce the manual effort of uncovering and repairing program defects during the code review (CR) process. However, the limited accuracy and considerable time cost of existing APR approaches hinder their adoption in industrial practice. One key factor is the under-utilization of review comments, which provide valuable insights into defects and potential fixes. Recent advances in Large Language Models (LLMs) have improved their ability to comprehend natural and programming languages, enabling them to generate patches based on review comments. This paper conducts a comprehensive investigation into the effective use of LLMs for repairing CR defects. Various prompts are designed and compared across mainstream LLMs using two distinct datasets, one drawn from human reviewers and one from automated checkers. Experimental results demonstrate a remarkable repair rate of 72.97% with the best prompt, highlighting a substantial improvement in the effectiveness and practicality of automatic repair techniques.

Paper Structure

This paper contains 26 sections, 8 figures, and 5 tables.

Introduction
Code Review
Defect Identification
Repairing
Research Questions and Experiment Settings
Models
Prompts
Prompts for Zero-shot Learning
Prompts for Finetuning
Datasets
Dataset from Reviewer Comments (RD)
Dataset from the PMD Checker (PD)
Result Validation
Experiment results
Overall Effectiveness (RQ1)
...and 11 more sections

Figures (8)

Figure 1: The code review process in industrial practice. Red lines denote the new steps and components to cooperate with CLM.
Figure 2: CR defects identified by reviewers.
Figure 3: CR defects identified by the automated checker PMD.
Figure 4: The template of prompts in this paper.
Figure 5: A buggy code snippet that was only fixed by Code-LLaMA with P7. [FIX_START] is before line 2 and [FIX_END] is after line 10; both are omitted for simplicity.
...and 3 more figures
