PATCH: Empowering Large Language Model with Programmer-Intent Guidance and Collaborative-Behavior Simulation for Automatic Bug Fixing
Yuwei Zhang, Zhi Jin, Ying Xing, Ge Li, Fang Liu, Jiaxin Zhu, Wensheng Dou, Jun Wei
TL;DR
PATCH addresses two limitations of LLM-based automatic bug fixing: existing approaches treat bug fixing as a single-step task, and most give the model only the buggy code snippet as context. PATCH is a stage-wise framework that combines programmer-intent guidance with collaborative-behavior simulation: three ChatGPT agents cover bug reporting, bug diagnosis, patch generation, and patch verification, and the input is augmented with dependence context and commit messages. On the augmented BFP benchmark, PATCH outperforms 13 baselines, with substantial improvements in Fix@1 and related metrics, and ablation studies confirm the contribution of each component and of the number of interaction turns. The approach also generalizes across APR benchmarks and open-source LLMs, suggesting a practical, plug-and-play way to improve automatic bug fixing in real-world software development.
Abstract
Bug fixing is a central part of software development and maintenance. Recent research has made substantial progress in exploring the potential of large language models (LLMs) for automatically resolving software bugs. However, existing approaches overlook the collaborative nature of bug resolution and treat the process as a single-stage task. Moreover, most approaches take only the buggy code snippet as input to the LLM during the patch generation stage. To mitigate these limitations, we introduce a novel stage-wise framework named PATCH. Specifically, we first augment the buggy code snippet with the corresponding dependence context and intent information to better guide LLMs in generating correct candidate patches. Additionally, taking inspiration from bug management practices, we decompose the bug-fixing task into four distinct stages: bug reporting, bug diagnosis, patch generation, and patch verification. These stages are performed interactively by LLMs, simulating the collaborative behavior of programmers during the resolution of software bugs. By combining these contributions, PATCH effectively enhances the bug-fixing capability of LLMs. We implement PATCH using the dialogue-based LLM ChatGPT. Our evaluation on the widely used bug-fixing benchmark BFP demonstrates that PATCH achieves better performance than state-of-the-art LLMs.
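The four stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `BugInput`, `fix_bug`, and `stub_llm`, the prompt wording, the role labels (tester, developer, reviewer), and the "ACCEPT" convention for ending the review loop are all assumptions made for illustration. The LLM is modeled as any function from a prompt string to a reply string, so a real chat model could be substituted for the stub.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: an "LLM" is any function mapping a prompt string to a reply string.
LLM = Callable[[str], str]


@dataclass
class BugInput:
    buggy_code: str
    dependence_context: str  # code the buggy snippet depends on
    intent: str              # programmer intent, e.g. a commit message


def fix_bug(bug: BugInput, llm: LLM, max_turns: int = 2) -> str:
    """Run the four stages in order and return the last candidate patch."""
    # Augment the buggy snippet with dependence context and intent.
    augmented = (
        f"Buggy code:\n{bug.buggy_code}\n"
        f"Dependence context:\n{bug.dependence_context}\n"
        f"Programmer intent:\n{bug.intent}\n"
    )
    # Stage 1: bug reporting (hypothetical "tester" role).
    report = llm("ROLE: tester\nTASK: write a bug report\n" + augmented)
    # Stage 2: bug diagnosis (hypothetical "developer" role).
    diagnosis = llm(
        "ROLE: developer\nTASK: diagnose the bug\n"
        + augmented + f"Bug report:\n{report}\n"
    )
    patch, feedback = "", ""
    for _ in range(max_turns):
        # Stage 3: patch generation, informed by any earlier review feedback.
        patch = llm(
            "ROLE: developer\nTASK: generate a patch\n"
            + augmented
            + f"Diagnosis:\n{diagnosis}\nReviewer feedback:\n{feedback}\n"
        )
        # Stage 4: patch verification (hypothetical "reviewer" role).
        feedback = llm(
            "ROLE: reviewer\nTASK: verify the patch\n"
            + augmented + f"Patch:\n{patch}\n"
        )
        # Assumed convention: the reviewer replies "ACCEPT" to end the loop.
        if feedback.strip().upper().startswith("ACCEPT"):
            break
    return patch


def stub_llm(prompt: str) -> str:
    """Canned replies standing in for a real chat model."""
    if "TASK: write a bug report" in prompt:
        return "Loop reads one element past the end of the array."
    if "TASK: diagnose" in prompt:
        return "The loop bound should use < instead of <=."
    if "TASK: generate a patch" in prompt:
        return "for (int i = 0; i < n; i++)"
    return "ACCEPT"
```

Calling `fix_bug(BugInput(...), stub_llm)` walks through reporting, diagnosis, one round of patch generation, and one round of verification. The `max_turns` parameter bounds the number of generate-and-verify rounds, which loosely mirrors the idea of interaction turns between agents.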
