Semantic Constraint Inference for Web Form Test Generation

Parsa Alian, Noor Nashid, Mobina Shahbandeh, Ali Mesbah

TL;DR

FormNexus introduces Form Entity Relation Graphs (FERG) to capture semantic relationships among web form elements and leverages a step-by-step, LLM-guided constraint and value-generation workflow. By constructing Input Field Context and focusing on per-field semantics rather than entire HTML pages, it achieves high form submission state coverage (89% with GPT-4) and strong form-passage rates (83%). The approach combines embedding-based context, graph pruning, and a feedback loop to refine constraints, outperforming static baselines and other LLM-based methods. This work advances automated web form testing by integrating semantic inference with constrained LLM prompting and end-to-end test generation, offering practical benefits for robust form validation and testing pipelines.

Abstract

Automated test generation for web forms has been a longstanding challenge, exacerbated by the intrinsic human-centric design of forms and their complex, device-agnostic structures. We introduce an innovative approach, called FormNexus, for automated web form test generation, which emphasizes deriving semantic insights from individual form elements and relations among them, utilizing textual content, DOM tree structures, and visual proximity. The insights gathered are transformed into a new conceptual graph, the Form Entity Relation Graph (FERG), which offers machine-friendly semantic information extraction. Leveraging LLMs, FormNexus adopts a feedback-driven mechanism for generating and refining input constraints based on real-time form submission responses. The culmination of this approach is a robust set of test cases, each produced by methodically invalidating constraints, ensuring comprehensive testing scenarios for web forms. This work bridges the existing gap in automated web form testing by intertwining the capabilities of LLMs with advanced semantic inference methods. Our evaluation demonstrates that FormNexus combined with GPT-4 achieves 89% coverage in form submission states. This outcome significantly outstrips the performance of the best baseline model by a margin of 25%.
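The abstract's last step, producing test cases by invalidating constraints one at a time, can be illustrated with a minimal sketch. This is not FormNexus's implementation: the `Constraint` record, the `generate_test_cases` function, and the hard-coded valid and invalid values are all hypothetical stand-ins. In the paper, constraints and values come from LLM prompts and are refined using form submission feedback.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Constraint:
    """Hypothetical record for one inferred input constraint."""
    field: str          # name of the form field the constraint applies to
    description: str    # human-readable rule, e.g. "must contain @"
    valid_value: str    # a value assumed to satisfy the constraint
    invalid_value: str  # a value assumed to violate the constraint


def generate_test_cases(constraints: List[Constraint]) -> List[Dict]:
    """Build one all-valid case plus one case per invalidated constraint."""
    # Baseline: every field gets a value that satisfies its constraints.
    baseline: Dict[str, str] = {}
    for c in constraints:
        baseline.setdefault(c.field, c.valid_value)

    cases = [{"name": "all_valid", "inputs": dict(baseline), "expect_pass": True}]

    # Violate exactly one constraint per case; keep all other fields valid,
    # so a rejection can be attributed to that single constraint.
    for c in constraints:
        inputs = dict(baseline)
        inputs[c.field] = c.invalid_value
        cases.append({
            "name": f"violate:{c.field}:{c.description}",
            "inputs": inputs,
            "expect_pass": False,
        })
    return cases


# Usage with two illustrative (assumed) constraints.
constraints = [
    Constraint("email", "must contain @", "user@example.com", "user.example.com"),
    Constraint("age", "must be at least 18", "30", "5"),
]
cases = generate_test_cases(constraints)
for case in cases:
    print(case["name"], case["inputs"], case["expect_pass"])
```

Changing a single field per case keeps each negative test attributable to one constraint, which is the property the abstract relies on when it describes test cases as "produced by methodically invalidating constraints".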

Paper Structure

This paper contains 24 sections, 1 equation, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Air Canada's multi-city flight reservation form
  • Figure 2: FERG embedding creation stage
  • Figure 3: Relating local textual context in the relation graph
  • Figure 4: Constraint prompt structure
  • Figure 5: Value prompt structure
  • ...and 2 more figures

Theorems & Definitions (2)

  • Definition 1: Form Submission State (FSS)
  • Definition 2