NL-Debugging: Exploiting Natural Language as an Intermediate Representation for Code Debugging

Weiming Zhang, Qingyao Li, Xinyi Dai, Jizheng Chen, Kounianhua Du, Weiwen Liu, Yasheng Wang, Ruiming Tang, Yong Yu, Weinan Zhang

TL;DR

NL-Debugging tackles the challenge of debugging code with complex algorithmic errors by using natural language as an intermediate representation. The method comprises Backtranslation, Natural Language Refinement, and Regeneration, enabling iterative refinement guided by runtime feedback. Results on APPS and Codeforces show that NL-Debugging with Sketch NL representations yields substantial performance gains and expands the modification space compared with traditional code-focused debugging. The work highlights the value of natural language reasoning for automated software debugging and outlines future directions including tree-search, multi-granularity debugging, and application to broader software reasoning tasks.
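The three stages named above can be sketched as a simple loop. This is only an illustrative outline under stated assumptions, not the authors' implementation: the function `nl_debug`, its parameters, the `Feedback` type, and the iteration budget are all hypothetical names introduced here, and in practice each callable would wrap an LLM prompt or a test harness.

```python
from typing import Callable, Tuple

# Hypothetical type: (did all tests pass?, execution feedback text).
Feedback = Tuple[bool, str]


def nl_debug(
    code: str,
    run_tests: Callable[[str], Feedback],   # assumed test harness
    backtranslate: Callable[[str], str],    # assumed: code -> NL sketch
    refine: Callable[[str, str], str],      # assumed: (sketch, feedback) -> sketch
    regenerate: Callable[[str], str],       # assumed: NL sketch -> code
    max_iters: int = 3,                     # assumed iteration budget
) -> str:
    """Repair `code` by editing a natural language sketch of it.

    Returns the first candidate that passes the tests, or the last
    regenerated candidate once the iteration budget is exhausted.
    """
    for _ in range(max_iters):
        passed, feedback = run_tests(code)
        if passed:
            break
        sketch = backtranslate(code)        # Backtranslation
        sketch = refine(sketch, feedback)   # Natural Language Refinement
        code = regenerate(sketch)           # Regeneration
    return code
```

The point of the structure is that the edit happens on the sketch rather than on the code, so a refinement step is free to change the algorithm as a whole instead of patching individual lines.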

Abstract

Debugging is a critical aspect of the coding ability of large language models (LLMs). Early debugging efforts primarily focused on code-level analysis, which often falls short when addressing complex programming errors that require a deeper understanding of algorithmic logic. Recent advancements in LLMs have shifted attention toward leveraging natural language reasoning to enhance code-related tasks. However, two fundamental questions remain unanswered: What type of natural language format is most effective for debugging tasks? And what specific benefits does natural language reasoning bring to the debugging process? In this paper, we introduce NL-DEBUGGING, a novel framework that employs natural language as an intermediate representation to improve code debugging. By debugging at the natural language level, we demonstrate that NL-DEBUGGING outperforms traditional debugging methods and enables a broader modification space through direct refinement guided by execution feedback. Our findings highlight the potential of natural language reasoning to advance automated code debugging and address complex programming challenges.

Paper Structure

This paper contains 46 sections, 6 figures, 14 tables.

Figures (6)

  • Figure 1: The NL-Debugging framework. This iterative process includes backtranslation, refinement, and regeneration, ultimately improving debugging efficiency by utilizing natural language reasoning.
  • Figure 2: Pass rate (%) comparison of NL-Debugging and other code-level debugging methods with more debugging iterations.
  • Figure 3: Pass rate (%) comparison of NL2NL and NL2C sampling across different difficulty levels.
  • Figure 4: Pass@1 (%) comparison across different natural language debugging approaches.
  • Figure 5: Pass rate (%) comparison of NL-Debugging and other code debugging methods with more debugging iterations on both datasets.
  • ...and 1 more figure