Private-Library-Oriented Code Generation with Large Language Models

Daoguang Zan, Bei Chen, Yongshun Gong, Junzhi Cao, Fengji Zhang, Bingchao Wu, Bei Guan, Yilong Yin, Yongji Wang

TL;DR

This work tackles private-library-oriented code generation with a retrieval-augmented framework of two modules: APIFinder, which retrieves relevant APIs from API documentation using a dense dual-encoder retriever, and APICoder, which generates code that invokes those APIs. The authors further strengthen the generator with CodeGenAPI, a continually pre-trained variant that ingests API information prior to code blocks. To evaluate the approach, they create four private-library benchmarks (TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval) and show that API basics and examples are the most beneficial components for prompting, with CodeGenAPI providing consistent gains over off-the-shelf baselines. The results indicate that private-library code generation is feasible with retrieval-augmented prompting, although challenges remain in noise handling, error types, and scalability to large API sets, which point to directions for future tooling and privacy-aware development.

Abstract

Large language models (LLMs), such as Codex and GPT-4, have recently showcased remarkable code generation abilities, significantly boosting coding efficiency. This paper investigates the use of LLMs for code generation with private libraries, which are widely employed in everyday programming. Despite their capabilities, LLMs struggle to generate code that calls private APIs, because they have no exposure to these private libraries during pre-training. To address this challenge, we propose a novel framework that emulates how programmers write private code. The framework comprises two modules: APIFinder first retrieves potentially useful APIs from API documentation, and APICoder then leverages the retrieved APIs to generate private code. Specifically, APIFinder employs vector retrieval techniques and allows user involvement in the retrieval process. APICoder can directly use off-the-shelf code generation models. To further cultivate explicit proficiency in invoking APIs from prompts, we continually pre-train a reinforced version of APICoder, named CodeGenAPI. Our goal is to train both modules on vast public libraries so that they generalize to private ones. We also create four private-library benchmarks, TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval, and meticulously handcraft test cases for each benchmark to support comprehensive evaluation. Experiments on the four benchmarks consistently affirm the effectiveness of our approach. A deeper analysis is also conducted to glean additional insights.
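
The retrieve-then-generate flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a bag-of-words cosine similarity stands in for the dense retriever, the `toylib` API entries and the helper names `embed`, `retrieve_apis`, and `build_prompt` are hypothetical, and the final generation step (passing the prompt to a code model) is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical API documentation for an imaginary private library "toylib".
API_DOCS = [
    {"name": "toylib.read_table", "signature": "read_table(path)",
     "description": "Load a table from a file path."},
    {"name": "toylib.drop_duplicates", "signature": "drop_duplicates(table)",
     "description": "Remove duplicate rows from a table."},
    {"name": "toylib.sort_rows", "signature": "sort_rows(table, key)",
     "description": "Sort the rows of a table by a key column."},
]


def embed(text):
    """Toy stand-in for a learned encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_apis(query, docs, top_k=2):
    """Retrieval step: rank API entries by similarity to the task description."""
    query_vec = embed(query)
    scored = [
        (cosine(query_vec, embed(doc["name"] + " " + doc["description"])), doc)
        for doc in docs
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, apis):
    """Prompting step: place the retrieved API information before the task."""
    lines = [f"# API: {api['signature']}: {api['description']}" for api in apis]
    lines.append(f"# Task: {query}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    task = "remove duplicate rows from the table"
    print(build_prompt(task, retrieve_apis(task, API_DOCS)))
```

In this sketch the retrieved API entries are written as comments ahead of the task description, so a code model receiving the prompt sees the private API signatures before it starts generating.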

Paper Structure (46 sections, 3 equations, 13 figures, 8 tables)


Figures (13)

  • Figure 1: An example of code generation on a public library (pandas) and a private library (monkey) by Codex 12B and a programmer.
  • Figure 2: A simple Python example of code generation. $\blacksquare$ represents a code library such as pandas; $\blacksquare$ represents code snippets, functions, or classes; $\blacksquare$ represents a natural language description of the programming problem; $\blacksquare$ represents the target code that solves the programming problem in $\blacksquare$, and may call APIs from $\blacksquare$ and $\blacksquare$.
  • Figure 3: An API instance of API documentation. It showcases the primary components of each API in API documentation: API name, signature, description, parameters, related APIs, and API examples.
  • Figure 4: Schematic diagram of our framework: APIFinder first retrieves potentially useful APIs from API documentation, and then APICoder generates the target code based on the retrieved APIs.
  • Figure 5: An overview of the preparation of training corpus for APIFinder.
  • ...and 8 more figures