Less is More: Adaptive Program Repair with Bug Localization and Preference Learning
Zhenlong Dai, Bingrui Chen, Zhuoluo Zhao, Xiu Tang, Sai Wu, Chang Yao, Zhipeng Gao, Jingyuan Chen
TL;DR
Adaptive Program Repair (AdaPR) is the task of generating correct patches while keeping the fixed code consistent with the original buggy code and minimizing edits. The authors introduce AdaPatcher, a two-stage framework consisting of a Bug Locator (diff-based localization with self-debug learning) and a Program Modifier (location-aware repair learning, a hybrid training strategy for selective reference, and adaptive preference learning) that produces patches with minimal changes. They present the ACPR dataset, with over 52k samples, and report substantial gains in both patch accuracy and code consistency over baselines including GPT-4o and CodeLlama variants. The work advances practical APR by improving traceability and reducing unintended code changes, and it lays groundwork for future work on adaptive, minimal-change repair.
Abstract
Automated Program Repair (APR) is the task of automatically generating patches for buggy code. However, most research focuses on generating correct patches while ignoring the consistency between the fixed code and the original buggy code. How to conduct adaptive bug fixing and generate patches with minimal modifications has seldom been investigated. To bridge this gap, we first introduce a novel task, namely AdaPR (Adaptive Program Repair). We then propose a two-stage approach, AdaPatcher (Adaptive Patch Generator), to enhance program repair while maintaining consistency. In the first stage, we utilize a Bug Locator with self-debug learning to accurately pinpoint bug locations. In the second stage, we train a Program Modifier to ensure consistency between the post-modified fixed code and the pre-modified buggy code. The Program Modifier is enhanced with a location-aware repair learning strategy to generate patches based on the identified buggy lines, a hybrid training strategy for selective reference, and adaptive preference learning to prioritize fewer changes. The experimental results show that our approach outperforms a set of baselines by a large margin, validating the effectiveness of our two-stage framework for the newly proposed AdaPR task.
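To make the notions of bug location and consistency concrete, here is a minimal Python sketch built on the standard-library difflib module. It is an illustration under stated assumptions, not the authors' implementation: the function names locate_buggy_lines and retained_fraction are hypothetical, the line-level diff stands in for whatever diff-based localization AdaPatcher actually uses, and the retained-line ratio is only one plausible proxy for consistency, not necessarily the metric used in the paper.

```python
import difflib


def locate_buggy_lines(buggy: str, fixed: str) -> list[int]:
    """Return 1-based line numbers of `buggy` that a line diff marks as edited.

    Assumption: a buggy/fixed pair is available, so the diff between them
    can serve as a localization label (hypothetical helper, not the paper's).
    """
    buggy_lines = buggy.splitlines()
    fixed_lines = fixed.splitlines()
    matcher = difflib.SequenceMatcher(a=buggy_lines, b=fixed_lines, autojunk=False)
    located = set()
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            # Lines i1..i2-1 (0-based) of the buggy code were changed or removed.
            located.update(range(i1 + 1, i2 + 1))
        elif tag == "insert" and buggy_lines:
            # A pure insertion touches no buggy line; attribute it to the
            # preceding line (or to line 1 if the insertion is at the top).
            located.add(max(i1, 1))
    return sorted(located)


def retained_fraction(buggy: str, fixed: str) -> float:
    """Fraction of buggy lines kept unchanged in the fix (a consistency proxy)."""
    buggy_lines = buggy.splitlines()
    fixed_lines = fixed.splitlines()
    if not buggy_lines:
        return 1.0
    matcher = difflib.SequenceMatcher(a=buggy_lines, b=fixed_lines, autojunk=False)
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / len(buggy_lines)
```

For a two-line function whose second line uses the wrong operator, the sketch reports line 2 as the bug location and a retained fraction of 0.5. A patch that rewrites both lines would score 0.0 on the same proxy even if it is equally correct, which is the kind of unnecessary change the AdaPR task aims to avoid.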
