No Man is an Island: Towards Fully Automatic Programming by Code Search, Code Generation and Program Repair

Quanjun Zhang, Chunrong Fang, Ye Shang, Tongke Zhang, Shengcheng Yu, Zhenyu Chen

TL;DR

This paper tackles automatic programming by integrating three complementary research areas: code search, code generation, and program repair, using a unified framework called Cream. Cream employs two retrieval strategies (IR-based and DL-based) to fetch relevant code, augments LLM-driven code generation with retrieved examples, and applies test-driven dynamic repair to refine outputs. Preliminary experiments on MBPP with CodeLlama and real-world tasks from CoderEval demonstrate meaningful gains over generation-only baselines, illustrating the mutual benefits of retrieval-guided generation and runtime feedback. The work highlights a practical path for leveraging traditional software engineering tools alongside modern LLMs to enhance automatic programming and suggests directions for broader deployment and evaluation.

Abstract

Automatic programming attempts to minimize human intervention in the generation of executable code and has been a long-standing challenge in the software engineering community. To advance automatic programming, researchers are focusing on three primary directions: (1) code search, which reuses existing code snippets from external databases; (2) code generation, which produces new code snippets from natural language; and (3) program repair, which refines existing code snippets by fixing detected bugs. Despite significant advancements, the effectiveness of state-of-the-art techniques is still limited, for example in the usability of searched code and the correctness of generated code. Motivated by the real-world programming process, in which developers usually rely on external tools such as code search engines and code testing tools, we propose Cream, an automatic programming framework that leverages recent large language models (LLMs) to integrate the three research areas and address their inherent limitations. In particular, our framework first applies different code search strategies to retrieve similar code snippets, which are then used to guide the code generation process of LLMs. It then validates the generated code with compilers and test cases, and constructs repair prompts that query LLMs for correct patches. We conduct preliminary experiments to demonstrate the potential of our framework, e.g., helping CodeLlama solve 267 programming problems, an improvement of 62.53%. As a generic framework, Cream can integrate various code search, generation, and repair tools, combining these three research areas for the first time. More importantly, it demonstrates the potential of using traditional SE tools to enhance the usability of LLMs in automatic programming.
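The search, generate, validate, and repair loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Jaccard token-overlap ranking is only a stand-in for an IR-based search strategy, the prompt wording is invented (the paper's actual templates are in its Figures 3 and 4 and are not reproduced here), and `search`, `generation_prompt`, `run_tests`, `solve`, and the `llm` callable are all hypothetical names introduced for illustration.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical corpus format: (natural-language description, code) pairs.
Corpus = List[Tuple[str, str]]


def tokenize(text: str) -> set:
    """Lowercase alphanumeric tokens; a stand-in for a real IR tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(query: str, corpus: Corpus, k: int = 1) -> Corpus:
    """Rank corpus entries by Jaccard overlap between query and description."""
    q = tokenize(query)

    def score(entry: Tuple[str, str]) -> float:
        d = tokenize(entry[0])
        union = q | d
        return len(q & d) / len(union) if union else 0.0

    return sorted(corpus, key=score, reverse=True)[:k]


def generation_prompt(task: str, examples: Corpus) -> str:
    """Prepend retrieved snippets to the task (assumed wording, not the paper's)."""
    shots = "\n\n".join(f"# {desc}\n{code}" for desc, code in examples)
    return f"Similar code:\n{shots}\n\nTask: {task}\nWrite a Python function."


def run_tests(code: str, tests: List[str]) -> str:
    """Return '' if the code loads and every test passes, else the first error.

    exec() is used only to keep the sketch short; untrusted generated code
    should run in a sandbox or a separate process.
    """
    namespace: dict = {}
    try:
        exec(code, namespace)  # plays the role of the compile step
        for test in tests:
            exec(test, namespace)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return ""


def solve(task: str, tests: List[str], corpus: Corpus,
          llm: Callable[[str], str], max_repairs: int = 2) -> Optional[str]:
    """Generate code with retrieved context, then repair it using test feedback."""
    code = llm(generation_prompt(task, search(task, corpus)))
    for _ in range(max_repairs):
        error = run_tests(code, tests)
        if not error:
            return code
        # Repair prompt: task, failing code, and the observed runtime feedback.
        code = llm(f"Task: {task}\nBuggy code:\n{code}\n"
                   f"Error: {error}\nFix the code.")
    return code if not run_tests(code, tests) else None
```

Here `llm` is any function from a prompt string to a code string, so a call to a real model such as CodeLlama or a deterministic stub for testing can be swapped in without changing the loop. Bounding the loop with `max_repairs` is a design choice of this sketch; the number of repair rounds the paper uses is not stated in this section.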

Paper Structure

This paper contains 18 sections, 2 equations, 8 figures, and 1 algorithm.

Figures (8)

  • Figure 1: A common programming scenario during software development
  • Figure 2: The overall workflow of this paper
  • Figure 3: Retrieval-Augmented Code Generation Prompt Template
  • Figure 4: Test-Driven Program Repair Prompt Template
  • Figure 5: The number of MBPP problems correctly solved by CodeLlama
  • ...and 3 more figures