An approach for API synthesis using large language models
Hua Zhong, Shan Jiang, Sarfraz Khurshid
TL;DR
The paper tackles API synthesis by prompting large language models (LLMs) to generate Java API implementations from a method signature and a small set of input/output examples. It introduces an end-to-end methodology that combines assistant role assignment, chain-of-thought reasoning, few-shot learning, and follow-up prompts to refine results iteratively, without requiring a predefined component library. Across 135 real-world tasks, the approach achieves a 133/135 success rate and 100% compilability when follow-up prompts are used, and it produces readable code with meaningful names and comments. It also outperforms the state-of-the-art tool FrAngel across multiple benchmarks and generalizes well to unseen tasks. The results indicate that LLMs can capture developer intent and manage complex control structures, offering a practical, interactive pathway for API synthesis with implications for rapid software development and tool design; the prompts and benchmarks are released to support reproducibility. The key finding is that prompt-engineered LLMs can synthesize correct, test-passing, and readable APIs from minimal user input, potentially changing how APIs are created and integrated in practice, with an area formula such as $A = \pi ab$ illustrating the kind of mathematical reasoning that generated code can involve.
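As a rough illustration of the prompting workflow described above, the following Java sketch assembles an initial prompt from a method signature and input/output examples, then builds a follow-up prompt from a failure message. It is a minimal sketch, not the authors' implementation: the class `PromptSketch`, its methods, the `Example` type, and all prompt wording are hypothetical, and few-shot demonstrations and the actual LLM call are omitted.

```java
import java.util.List;

// Hypothetical helper: names and prompt wording are assumptions, not taken from the paper.
final class PromptSketch {

    // One input/output example, e.g. the call "max(1, 2)" with expected result "2".
    static final class Example {
        final String call;
        final String expected;

        Example(String call, String expected) {
            this.call = call;
            this.expected = expected;
        }
    }

    // Builds the initial prompt: role assignment, a chain-of-thought cue,
    // the target method signature, and the input/output examples.
    static String buildPrompt(String signature, List<Example> examples) {
        StringBuilder sb = new StringBuilder();
        sb.append("You are an expert Java developer.\n");            // role assignment
        sb.append("Think step by step before writing the code.\n"); // chain-of-thought cue
        sb.append("Implement this method:\n").append(signature).append("\n");
        sb.append("It must satisfy these examples:\n");
        for (Example e : examples) {
            sb.append("  ").append(e.call).append(" => ").append(e.expected).append("\n");
        }
        return sb.toString();
    }

    // Builds a follow-up prompt that feeds a compile or test failure back to the model.
    static String buildFollowUp(String failureMessage) {
        return "The previous implementation failed:\n"
                + failureMessage
                + "\nPlease fix the code and return the full method.\n";
    }

    public static void main(String[] args) {
        List<Example> examples = List.of(
                new Example("max(1, 2)", "2"),
                new Example("max(5, 3)", "5"));
        System.out.print(buildPrompt("static int max(int a, int b)", examples));
        System.out.print(buildFollowUp("error: missing return statement"));
    }
}
```

In a full pipeline, the follow-up step would presumably be repeated until the generated method compiles and passes the examples, or until a retry limit is reached; that loop is left out here because it depends on the model API in use.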
Abstract
APIs play a pivotal role in modern software development by enabling seamless communication and integration between various systems, applications, and services. Component-based API synthesis is a form of program synthesis that constructs an API by assembling predefined components from a library. Existing API synthesis techniques typically implement dedicated search strategies over bounded spaces of possible implementations, which can be very large and time consuming to explore. In this paper, we present a novel approach of using large language models (LLMs) in API synthesis. LLMs offer a foundational technology to capture developer insights and provide an ideal framework for enabling more effective API synthesis. We perform an experimental evaluation of our approach using 135 real-world programming tasks, and compare it with FrAngel, a state-of-the-art API synthesis tool. The experimental results show that our approach completes 133 of the tasks, and overall outperforms FrAngel. We believe LLMs provide a very useful foundation for tackling the problem of API synthesis, in particular, and program synthesis, in general.
