An approach for API synthesis using large language models

Hua Zhong, Shan Jiang, Sarfraz Khurshid

TL;DR

The paper tackles API synthesis by using large language models (LLMs), guided through prompt engineering, to generate Java API implementations from method signatures and a small set of input/output examples. It introduces an end-to-end methodology that combines assistant role assignment, chain-of-thought reasoning, few-shot learning, and follow-up prompts to refine results iteratively, without requiring a predefined component library. Across 135 real-world tasks, the approach succeeds on 133 and reaches 100% compilability when follow-up prompts are used. The generated code is readable, with meaningful names and comments. The approach also outperforms the state-of-the-art tool FrAngel across multiple benchmarks and generalizes well to unseen tasks. The work shows that LLMs can capture developer intent and handle complex control structures, offering a practical, interactive pathway for API synthesis, and it emphasizes reproducibility through released prompts and benchmarks. The key finding is that prompt-engineered LLMs can synthesize correct, test-passing, and readable APIs from minimal user input; the ellipse-area formula $A = \pi ab$ illustrates the kind of mathematical reasoning that can appear in generated code.
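To make the task format concrete, the sketch below pairs a method signature with a few input/output examples and checks a candidate implementation against them, using the ellipse area $A = \pi ab$ as the target. It is only an illustration under stated assumptions: the class name `EllipseAreaSketch`, the signature `ellipseArea(double, double)`, the example values, and the tolerance are all hypothetical, and the code is not the output shown in the paper's figures.

```java
public class EllipseAreaSketch {

    // Hypothetical input/output examples: each row is {a, b, expectedArea},
    // where a and b are the semi-axes of the ellipse.
    public static final double[][] EXAMPLES = {
        {1.0, 1.0, Math.PI},
        {2.0, 3.0, 6.0 * Math.PI},
        {0.0, 5.0, 0.0},
    };

    // Candidate implementation for the assumed signature
    // double ellipseArea(double a, double b), computing A = pi * a * b.
    public static double ellipseArea(double a, double b) {
        return Math.PI * a * b;
    }

    // Returns true if the candidate reproduces every example within the tolerance.
    public static boolean passesAll(double[][] examples, double tolerance) {
        for (double[] example : examples) {
            double actual = ellipseArea(example[0], example[1]);
            if (Math.abs(actual - example[2]) > tolerance) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("passes all examples: " + passesAll(EXAMPLES, 1e-9));
    }
}
```

A candidate that fails this kind of check is the situation in which a follow-up prompt would ask the model to revise its answer; the check itself only confirms agreement with the given examples, not general correctness.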

Abstract

APIs play a pivotal role in modern software development by enabling seamless communication and integration between various systems, applications, and services. Component-based API synthesis is a form of program synthesis that constructs an API by assembling predefined components from a library. Existing API synthesis techniques typically implement dedicated search strategies over bounded spaces of possible implementations, which can be very large and time-consuming to explore. In this paper, we present a novel approach that uses large language models (LLMs) for API synthesis. LLMs offer a foundational technology to capture developer insights and provide an ideal framework for enabling more effective API synthesis. We perform an experimental evaluation of our approach using 135 real-world programming tasks and compare it with FrAngel, a state-of-the-art API synthesis tool. The experimental results show that our approach completes 133 of the tasks and overall outperforms FrAngel. We believe LLMs provide a very useful foundation for tackling the problem of API synthesis in particular and program synthesis in general.

Paper Structure

This paper contains 20 sections, 7 figures, and 3 tables.

Figures (7)

  • Figure 1: getOffsetForLine (above) & ellipseArea (below) APIs produced by FrAngel
  • Figure 2: getOffsetForLine (above) & ellipseArea (below) APIs produced by the proposed methodology
  • Figure 3: Workflow of the proposed API synthesis approach
  • Figure 4: One of the two programs generated by LLMs that failed to pass all test cases
  • Figure 5: Two false positives produced by FrAngel (left), and the corresponding correct programs
  • ...and 2 more figures