GenIA-E2ETest: A Generative AI-Based Approach for End-to-End Test Automation

Elvis Júnior, Alan Valejo, Jorge Valverde-Rebaza, Vânia de Oliveira Neves

TL;DR

GenIA-E2ETest is an open-source pipeline that converts natural-language test scenarios into executable E2E test scripts using a three-level prompting strategy. The approach maps scenarios to modular JSON representations, extracts and refines UI elements with LLM-guided prompts, and generates Robot Framework tests that use SeleniumLibrary. An empirical evaluation on two web applications shows strong element-identification and execution performance (an average of 77% for both element metrics, 82% execution precision, and 85% execution recall) with little manual adaptation (about 10% of the script modified), though robustness decreases in scenarios with context-dependent navigation or dynamic content. The results indicate that AI-assisted E2E test generation is practical for accelerating test creation and broadening access to automated testing, while highlighting open challenges in preserving context and handling dynamic or ambiguous UI structures. Artifacts and prompts are openly available for reproducibility and extension.

Abstract

Software testing is essential to ensure system quality, but it remains time-consuming and error-prone when performed manually. Although recent advances in Large Language Models (LLMs) have enabled automated test generation, most existing solutions focus on unit testing and do not address the challenges of end-to-end (E2E) testing, which validates complete application workflows from user input to final system response. This paper introduces GenIA-E2ETest, which leverages generative AI to automatically generate executable E2E test scripts from natural language descriptions. We evaluated the approach on two web applications, assessing completeness, correctness, adaptation effort, and robustness. Results were encouraging: the scripts achieved an average of 77% for both element metrics, 82% execution precision, and 85% execution recall; required minimal manual adjustments (average manual modification rate of 10%); and showed consistent performance in typical web scenarios. Although some sensitivity to context-dependent navigation and dynamic content was observed, the findings suggest that GenIA-E2ETest is a practical and effective solution to accelerate E2E test automation from natural language, reducing manual effort and broadening access to automated testing.
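
To make the reported figures concrete, the sketch below computes precision, recall, and a manual modification rate under the standard set-based definitions. It is an illustration only: the abstract does not spell out the paper's exact formulas, so the function names, the locator strings, and the line-based modification rate are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    precision: float
    recall: float


def precision_recall(generated: set[str], expected: set[str]) -> Scores:
    """Standard set-based precision and recall.

    `generated` holds the items a generated script contains (for example,
    UI element locators); `expected` holds the items a reference script
    contains. Both inputs are hypothetical stand-ins for the paper's data.
    """
    true_positives = len(generated & expected)
    precision = true_positives / len(generated) if generated else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return Scores(precision, recall)


def modification_rate(changed_lines: int, total_lines: int) -> float:
    """Fraction of script lines edited by hand (assumed definition)."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    return changed_lines / total_lines


# Hypothetical locators for a login scenario.
generated = {"id=username", "id=password", "id=login", "css=.banner"}
expected = {"id=username", "id=password", "id=login", "id=remember", "id=logout"}

scores = precision_recall(generated, expected)
print(scores.precision, scores.recall)  # 0.75 0.6
print(modification_rate(changed_lines=2, total_lines=20))  # 0.1
```

In this toy case three of the four generated locators are correct (precision 0.75) and three of the five expected locators are found (recall 0.6); editing 2 of 20 lines gives a 10% modification rate, the same order as the average the abstract reports.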

Paper Structure

This paper contains 25 sections, 1 figure, 3 tables.

Figures (1)

  • Figure 1: Overview of the GenIA-E2ETest approach and multi-level prompting strategy