APITestGenie: Automated API Test Generation through Generative AI

André Pereira; Bruno Lima; João Pascoal Faria

TL;DR

This article introduces APITestGenie, an approach and tool that leverages LLMs to generate executable API test scripts from business requirements and API specifications, and recommends human intervention to validate or refine generated scripts before integration into CI/CD pipelines.

Abstract

Intelligent assistants powered by Large Language Models (LLMs) can generate program and test code with high accuracy, boosting developers' and testers' productivity. However, there is a lack of studies exploring LLMs for testing Web APIs, which constitute fundamental building blocks of modern software systems and pose significant test challenges. Hence, in this article, we introduce APITestGenie, an approach and tool that leverages LLMs to generate executable API test scripts from business requirements and API specifications. In experiments with 10 real-world APIs, the tool generated valid test scripts 57% of the time. With three generation attempts per task, this success rate increased to 80%. Human intervention is recommended to validate or refine generated scripts before integration into CI/CD pipelines, positioning our tool as a productivity assistant rather than a replacement for testers. Feedback from industry specialists indicated a strong interest in adopting our tool for improving the API test process.

Paper Structure (11 sections, 4 figures, 2 tables)

RELATED WORK
SOLUTION
Architecture and Workflow
System and User Prompt
Prompt with the API Specification
Prompt with the Business Requirement
EVALUATION
Experimental Evaluation
User Feedback
Answers to Research Questions
CONCLUSION

Figures (4)

Figure 1: APITestGenie flow diagram, showcasing the interactions of the main flows in the system, inputs and outputs.
Figure 2: Characterization of the APIs under test based on their internal complexity and the level of detail of the documentation, with associated prompt levels recommended.
Figure 3: Estimated test generation success probability by number of attempts and prompt level (L1 to L3) or overall (All). $valid@k$ shows the average probability of generating at least one valid test script in $k$ attempts across all test generation tasks. Overall, $valid@1 \approx 57.3\%$ and $valid@3 \approx 80\%$.
Figure 4: Staff opinions on APITestGenie generated tests. Opinions were collected anonymously, after a workshop presentation, from 11 experts at our industry partner who have over five years of experience in the product.
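
The $valid@k$ metric in Figure 3 can be illustrated with a short sketch. The captions here do not state how the estimate is computed, so the code below assumes an estimator analogous to the widely used pass@k formula: for a task with `n` recorded attempts of which `c` are valid, the chance that at least one of `k` attempts drawn without replacement is valid is `1 - C(n-c, k) / C(n, k)`, averaged over tasks. The function names and the per-task counts are hypothetical and are not data from the paper.

```python
from math import comb


def valid_at_k(n: int, c: int, k: int) -> float:
    """Chance that at least one of k attempts, drawn without replacement
    from n recorded attempts of which c were valid, is valid.
    Assumed estimator (pass@k style); not confirmed by the source."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k attempts were invalid.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_valid_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average valid@k over tasks; each entry is (n attempts, c valid)."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(valid_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical per-task outcomes, for illustration only.
tasks = [(3, 3), (3, 2), (3, 0), (3, 1)]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"valid@{k} = {mean_valid_at_k(tasks, k):.3f}")
```

As a rough sanity check on the reported figures: if every attempt succeeded independently with probability 0.573, three attempts would yield at least one valid script with probability $1 - (1 - 0.573)^3 \approx 0.92$. The reported $valid@3 \approx 80\%$ is lower, which would be consistent with success rates varying across tasks, so that retries help less on the harder ones.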
