A Survey on Automated Program Repair Techniques
Kai Huang, Zhengzi Xu, Su Yang, Hongyu Sun, Xuejun Li, Zheng Yan, Yuqing Zhang
TL;DR
This survey provides a structured overview of Automated Program Repair (APR), classifying techniques into search-based, constraint-based, template-based, and learning-based paradigms and tracing their evolution, key strengths, and limitations. It introduces a uniform evaluation framework and discusses current progress, challenges, datasets, and industrial deployment, with a forward-looking emphasis on large language models and hybrid approaches. The work highlights critical issues such as patch overfitting, dataset quality, and benchmark biases, and argues for more comprehensive empirical studies and cross-domain knowledge transfer to accelerate real-world adoption. Overall, the paper maps a trajectory from traditional heuristic repair to data-driven and LLM-enabled repair, outlining concrete directions for improving repair quality, efficiency, and generality across languages and defect types.
Abstract
With the rapid development and widespread adoption of software, modern society increasingly relies on software systems. However, the problems that software exposes have also come to the fore, and software defects have become a major burden on developers. In this context, Automated Program Repair (APR) techniques have emerged, aiming to fix software defects automatically and reduce manual debugging work. In particular, advances in deep learning have given rise to numerous learning-based APR techniques in recent years, which also bring new opportunities for APR research. To give researchers a quick overview of the complete development of APR techniques and their future opportunities, we revisit the evolution of APR techniques and discuss in depth the latest advances in APR research. In this paper, we introduce the development of APR techniques in terms of four patch generation schemes: search-based, constraint-based, template-based, and learning-based. Moreover, we propose a uniform set of criteria to review and compare APR tools, summarize the advantages and disadvantages of APR techniques, and discuss the current state of APR development. Furthermore, we introduce research in technical areas related to APR, which has also provided strong motivation for advancing APR. Finally, we analyze current challenges and future directions, especially highlighting the critical opportunities that large language models bring to APR research.
