
LecPrompt: A Prompt-based Approach for Logical Error Correction with CodeBERT

Zhenyu Xu, Victor S. Sheng

TL;DR

The soft-prompt method provides a novel solution in low-cost scenarios, ensuring that the model can be fine-tuned to the specific nuances of the logical error correction task without incurring high computational costs.

Abstract

Logical errors in programming don't raise compiler alerts, making them hard to detect. These silent errors can disrupt a program's function or cause run-time issues. Their correction requires deep insight into the program's logic, highlighting the importance of automated detection and repair. In this paper, we introduce LecPrompt, a prompt-based approach that localizes and repairs logical errors by harnessing the capabilities of CodeBERT, a transformer-based large language model trained on code. First, LecPrompt leverages a large language model to calculate perplexity and log probability metrics, pinpointing logical errors at both token and line levels. Through statistical analysis, it identifies tokens and lines that deviate significantly from the expected patterns recognized by large language models, marking them as potential error sources. Second, by framing the logical error correction challenge as a Masked Language Modeling (MLM) task, LecPrompt employs CodeBERT to autoregressively repair the identified error tokens. Finally, the soft-prompt method provides a novel solution in low-cost scenarios, ensuring that the model can be fine-tuned to the specific nuances of the logical error correction task without incurring high computational costs. To evaluate LecPrompt's performance, we created a method to introduce logical errors into correct code and applied it to QuixBugs to produce the QuixBugs-LE dataset. Our evaluations on the QuixBugs-LE dataset cover both Python and Java. For Python, LecPrompt achieves 74.58% top-1 token-level repair accuracy and 27.4% program-level repair accuracy. For Java, LecPrompt achieves 69.23% top-1 token-level repair accuracy and 24.7% program-level repair accuracy.
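
The localization step described in the abstract can be illustrated with a minimal sketch. It assumes per-token log probabilities have already been obtained from a language model; the function names, the threshold `k`, and the sample values below are hypothetical and are not taken from the paper, which states only that statistical analysis is used to find tokens and lines that deviate significantly.

```python
import math
from statistics import mean, pstdev

def flag_low_probability_tokens(log_probs, k=1.5):
    """Return indices of tokens whose log probability lies more than
    k standard deviations below the mean. The value of k is an
    illustrative assumption, not a setting from the paper."""
    mu = mean(log_probs)
    sigma = pstdev(log_probs)
    if sigma == 0:
        return []  # all tokens equally likely: nothing stands out
    return [i for i, lp in enumerate(log_probs) if lp < mu - k * sigma]

def line_perplexity(log_probs):
    """Perplexity of a line: exp of the negative mean token log probability."""
    return math.exp(-mean(log_probs))

# Hypothetical log probabilities for the tokens of one line of code.
tokens = ["x", "=", "a", "-", "b"]
log_probs = [-0.2, -0.1, -0.3, -4.0, -0.25]

suspects = flag_low_probability_tokens(log_probs)
print([tokens[i] for i in suspects], line_perplexity(log_probs))
```

In this toy input only the `-` operator falls far below the other tokens, so it is the one marked as a potential error source; the same outlier test could be applied to per-line perplexities to rank lines.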

Paper Structure

This paper contains 36 sections, 5 equations, 7 figures, 6 tables, 1 algorithm.

Figures (7)

  • Figure 1: A comparison between hard and soft prompts. While hard prompts utilize a pre-designed template to elicit model outputs, soft prompts use continuously tunable embeddings to adapt to various tasks. In the example, <mask> is filled with "delicious" or "disgust".
  • Figure 2: Log Probability of Each Token
  • Figure 4: CodeBERT is used to predict and fill the two <mask> tokens in the input program. The process is as follows: (a) Use CodeBERT to predict the top 1 candidate for the first <mask> token, denoted as '&', and substitute the first <mask> with '&'. (b) Next, use CodeBERT to predict the second <mask> token, and substitute it with the top 1 candidate, denoted as '+='. (c) With both <mask> tokens now filled, the correction process is complete.
  • Figure 5: Model Architecture
  • Figure 6: Top-1 prediction accuracies for different parameters. (a) Top-1 prediction accuracy of CodeBERT-MLM-PT vs. soft-prompt length from 0 to 100. (b) Top-1 token prediction accuracy of CodeBERT-MLM-FT vs. masked percentage.
  • ...and 2 more figures
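
The sequential filling process described in the Figure 4 caption can be sketched as follows. This is an illustration under stated assumptions: `toy_predictor` is a hypothetical stand-in for a CodeBERT masked-language-model call (it is not a real model and does not use any CodeBERT API), and the token sequence is an invented example. The point is only the control flow: each <mask> is filled with the top-1 candidate, and later predictions see the earlier substitutions.

```python
from typing import Callable, List

MASK = "<mask>"

def fill_masks_sequentially(
    tokens: List[str],
    predict_top1: Callable[[List[str], int], str],
) -> List[str]:
    """Fill <mask> tokens left to right. Each prediction is made on the
    partially repaired sequence, so earlier fills inform later ones."""
    result = list(tokens)
    for i, tok in enumerate(tokens):
        if tok == MASK:
            result[i] = predict_top1(result, i)
    return result

def toy_predictor(tokens: List[str], position: int) -> str:
    """Hypothetical stand-in for a model call: returns '&' while another
    <mask> remains to the right, otherwise '+='."""
    remaining = tokens[position + 1:].count(MASK)
    return "&" if remaining > 0 else "+="

# Invented token sequence containing two masked positions.
program = ["n", MASK, "=", "n", "-", "1", "count", MASK, "1"]
repaired = fill_masks_sequentially(program, toy_predictor)
print(" ".join(repaired))
```

Replacing `toy_predictor` with a function that queries an actual masked language model would keep the loop unchanged, since the loop only requires a callable that maps the current tokens and a position to one candidate token.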