LLMs as Continuous Learners: Improving the Reproduction of Defective Code in Software Issues

Yalan Lin, Yingwei Ma, Rongyu Cao, Binhua Li, Fei Huang, Xiaodong Gu, Yongbin Li

TL;DR

EvoCoder is a multi-agent continuous learning framework for issue code reproduction. Its reflection mechanism lets the LLM continuously learn from previously resolved problems and dynamically refine its strategies for newly emerging challenges.

Abstract

Reproducing buggy code is the first and a crucial step in issue resolving, as it aids in identifying the underlying problems and validating that generated patches resolve the problem. While numerous approaches have been proposed for this task, they primarily address common, widespread errors and struggle to adapt to unique, evolving errors specific to individual code repositories. To fill this gap, we propose EvoCoder, a multi-agent continuous learning framework for issue code reproduction. EvoCoder adopts a reflection mechanism that allows the LLM to continuously learn from previously resolved problems and dynamically refine its strategies for newly emerging challenges. To prevent experience bloating, EvoCoder introduces a novel hierarchical experience pool that enables the model to adaptively update common and repo-specific experiences. Our experimental results show a 20% improvement in issue reproduction rates over existing SOTA methods. Furthermore, integrating our reproduction mechanism significantly boosts the overall accuracy of the existing issue-resolving pipeline.
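To make the idea of a hierarchical experience pool more concrete, the following is a minimal sketch of a two-level store that keeps common experiences apart from repo-specific ones. It is not the authors' implementation: the class name `ExperiencePool`, the `max_per_level` cap, and the oldest-first eviction rule are all assumptions chosen for illustration, since the abstract does not say how experiences are updated or pruned.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ExperiencePool:
    """Hypothetical two-level pool: common lessons plus per-repository lessons."""

    # Assumed cap per level, standing in for whatever keeps the pool from bloating.
    max_per_level: int = 3
    common: list = field(default_factory=list)
    repo_specific: dict = field(default_factory=lambda: defaultdict(list))

    def _add(self, bucket, lesson):
        # Skip exact duplicates, then evict the oldest lesson if over the cap.
        if lesson in bucket:
            return
        bucket.append(lesson)
        if len(bucket) > self.max_per_level:
            bucket.pop(0)

    def add(self, lesson, repo=None):
        # A lesson with no repo goes to the common level.
        target = self.common if repo is None else self.repo_specific[repo]
        self._add(target, lesson)

    def retrieve(self, repo):
        # Common lessons first, followed by lessons recorded for this repo.
        return self.common + self.repo_specific.get(repo, [])
```

With this layout, a reproduction attempt on one repository sees the shared lessons plus only that repository's own lessons, so experiences from unrelated repositories do not crowd the prompt.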

Paper Structure

This paper contains 29 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: An example of issue reproduction for the KernelPCA library. The Expected Results section specifies the ground truth results for the original problem. The Actual Results section specifies the erroneous results that need to be reproduced based on the given steps. As seen in the upper right, the reproduced code successfully produces the "Actual result" in the issue (the three vectors are different in sign). To resolve this issue, a patch is needed to fix the bug. As seen in the lower right, applying the patch aligns the result with the "Expected result", ensuring that all three vectors are now correctly uniform in sign.
  • Figure 2: Examples of the characteristics of the encountered errors.
  • Figure 3: The overall framework of our approach.
  • Figure 4: Illustration of the issue reproduction process.
  • Figure 5: The transition matrix of error types from CodeR to our method. The number in each cell denotes the percentage of errors in CodeR that have transitioned to EvoCoder.
  • ...and 1 more figure