Adapting Knowledge Prompt Tuning for Enhanced Automated Program Repair

Xuemeng Cai, Lingxiao Jiang

TL;DR

The paper tackles Automated Program Repair (APR) under data scarcity by adapting prompt tuning with knowledge prompts to guide large language models. It contrasts traditional fine-tuning with prompt-based approaches, introducing both hard/soft basic prompts and six domain-knowledge prompts, and evaluates across three model sizes and six APR benchmarks in four languages. The results show that prompt tuning, especially when augmented with relevant domain knowledge, consistently outperforms fine-tuning in scarce data settings, with GPT-Neo and CodeT5 variants benefiting differently across datasets. The work provides practical guidance on prompt design, initialization, and knowledge integration for APR, highlighting the potential for data-efficient patch generation in real-world scenarios, and makes its code and data publicly available.

Abstract

Automated Program Repair (APR) aims to enhance software reliability by automatically generating bug-fixing patches. Recent work has advanced the state of the art in APR by fine-tuning pre-trained large language models (LLMs), such as CodeT5. However, fine-tuning becomes less effective when training data is scarce, which is a common situation in practice. To alleviate this limitation, this paper adapts prompt tuning for enhanced APR and conducts a comprehensive study to evaluate its effectiveness in data scarcity scenarios, using three LLMs of different sizes and six diverse datasets across four programming languages. Prompt tuning rewrites the input to a model by adding extra prompt tokens and tunes both the model and the prompts on a small dataset. These tokens provide task-specific knowledge that can improve the model for APR, which is especially critical in data scarcity scenarios. Moreover, domain knowledge has proven crucial in many code intelligence tasks, but existing studies fail to leverage domain knowledge during prompt tuning for APR. To close this gap, we introduce knowledge prompt tuning, an approach that adapts prompt tuning with six distinct types of code- or bug-related domain knowledge for APR. Our work, to the best of our knowledge, is the first to adapt and evaluate prompt tuning and the effectiveness of code- or bug-related domain knowledge for APR, particularly under data scarcity settings. Our evaluation results demonstrate that prompt tuning with knowledge generally outperforms fine-tuning under various experimental settings, achieving an average improvement of 87.33% over fine-tuning in data scarcity scenarios.
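
To make the idea of "rewriting the input by adding extra prompt tokens" concrete, the following is a minimal sketch of a hard (natural-language) prompt with an optional knowledge hint. The template wording, the function name `build_hard_prompt`, and the example knowledge string are assumptions made for illustration; they are not the templates or the six knowledge types used in the paper.

```python
# Hypothetical hard-prompt template; the wording is illustrative only.
# [X] is the input slot for the buggy code, [mask] is the output slot for the patch.
HARD_TEMPLATE = "Fix the bug in the following code: [X] The fixed code is: [mask]"


def build_hard_prompt(buggy_code: str, knowledge: str = "",
                      template: str = HARD_TEMPLATE) -> str:
    """Fill the [X] slot with the buggy code and optionally prepend a knowledge hint."""
    if "[X]" not in template or "[mask]" not in template:
        raise ValueError("template needs both an [X] input slot and a [mask] output slot")
    # Replace only the first [X] so the template's own slot is the one filled.
    prompt = template.replace("[X]", buggy_code.strip(), 1)
    if knowledge:
        # A knowledge prompt here is extra text describing the code or the bug.
        prompt = f"{knowledge.strip()} {prompt}"
    return prompt


if __name__ == "__main__":
    print(build_hard_prompt("for (int i = 0; i <= n; i++) sum += a[i];",
                            knowledge="Bug type: off-by-one error."))
```

In this sketch the model would be asked to generate the text for the `[mask]` slot; how the slots are mapped to encoder and decoder inputs depends on the model and is not shown here.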

Paper Structure

This paper contains 41 sections, 2 equations, 5 figures, 10 tables.

Figures (5)

  • Figure 1: Examples of inputs to CodeT5+ under the fine-tuning and prompt-tuning paradigms. $[X]$ is the slot for buggy code. The orange rectangles represent the prompt tokens, which can be either fixed natural language tokens or learnable soft tokens during prompt tuning.
  • Figure 2: Illustration of hard and soft prompts, where [X] and [mask] indicate the input slot and output slot, respectively.
  • Figure 3: Overview of experimental design
  • Figure 4: An example of buggy code from Defects4J successfully fixed by knowledge prompts
  • Figure 5: Results of fine-tuning and prompt tuning across different training set sizes
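
The captions of Figures 1 and 2 distinguish fixed natural-language prompt tokens from learnable soft tokens. As a rough sketch of the soft variant, the snippet below prepends a small set of trainable vectors to the embeddings of the buggy-code tokens. Plain Python lists stand in for tensors, and the function names, the random initialization range, and the dimensions are all assumptions for illustration rather than the paper's configuration; the training loop that would update these vectors is omitted.

```python
import random


def init_soft_prompt(num_tokens: int, dim: int, seed: int = 0) -> list[list[float]]:
    """Create `num_tokens` prompt vectors of size `dim` with small random values.

    Random initialization is an assumption; other schemes (for example, copying
    embeddings of real vocabulary tokens) are also possible.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(num_tokens)]


def prepend_soft_prompt(soft_prompt: list[list[float]],
                        token_embeddings: list[list[float]]) -> list[list[float]]:
    """Return the model input: soft prompt vectors followed by the code token embeddings."""
    if soft_prompt and token_embeddings and len(soft_prompt[0]) != len(token_embeddings[0]):
        raise ValueError("soft prompt and token embeddings must share the same dimension")
    return soft_prompt + token_embeddings


if __name__ == "__main__":
    code_embeddings = [[0.0] * 8 for _ in range(5)]   # stand-in for 5 embedded code tokens
    soft = init_soft_prompt(num_tokens=3, dim=8)
    model_input = prepend_soft_prompt(soft, code_embeddings)
    print(len(model_input), len(model_input[0]))
```

Unlike a hard prompt, these vectors do not correspond to any vocabulary token, so they can only be set by training.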