ARC: Compiling Hundreds of Requirement Scenarios into A Runnable Web System

Weiyu Kong; Yun Lin; Xiwen Teoh; Duc-Minh Nguyen; Ruofei Ren; Jiaxin Chang; Haoxu Hu; Haoyu Chen

TL;DR

ARC presents a requirement-centric alternative to stochastic code generation by introducing a graph-based DSL that encodes multi-modal requirements as a directed acyclic graph. It employs a bidirectional, test-driven loop consisting of a top-down architecture construction phase and a bottom-up constrained code generation phase, enforcing strict interface contracts and full traceability from requirements to code. Across six real-world web systems, ARC-generated systems pass 50.6% more GUI tests on average than the baselines, and the traceability records and modular interfaces support maintenance of the generated code. A user study with 21 participants shows that novice users find DSL-based requirement drafting approachable and effective for compiling production-grade repositories, although participants raised concerns about compilation time and the need to state requirements explicitly. Overall, ARC demonstrates that formal requirement compilation can produce maintainable, runnable software and offers a scalable path for large-scale AI-assisted software engineering.

Abstract

Large Language Models (LLMs) have improved programming efficiency, but their performance degrades significantly as requirements scale; when faced with multi-modal documents containing hundreds of scenarios, LLMs often produce incorrect implementations or omit constraints. We propose Agentic Requirement Compilation (ARC), a technique that moves beyond simple code generation to requirement compilation, enabling the creation of runnable web systems directly from multi-modal DSL documents. ARC generates not only source code but also modular designs for UI, API, and database layers, enriched test suites (unit, modular, and integration), and detailed traceability for software maintenance. Our approach employs a bidirectional test-driven agentic loop: a top-down architecture phase decomposes requirements into verifiable interfaces, followed by a bottom-up implementation phase where agents generate code to satisfy those tests. ARC maintains strict traceability across requirements, design, and code to facilitate intelligent asset reuse. We evaluated ARC by generating six runnable web systems from documents spanning 50-200 multi-modal scenarios. Compared to state-of-the-art baselines, ARC-generated systems pass 50.6% more GUI tests on average. A user study with 21 participants showed that novice users can successfully write DSL documents for complex systems, such as a 10K-line ticket-booking system, in an average of 5.6 hours. These results demonstrate that ARC effectively transforms non-trivial requirement specifications into maintainable, runnable software.
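The bidirectional loop described in the abstract can be pictured with a small sketch. This is not ARC's implementation: the graph, the node identifiers, the `generate` and `run_tests` callables, and the retry limit are all hypothetical stand-ins (a real system would call LLM agents and execute actual test suites). The sketch only shows the traversal order: interfaces and tests are fixed top-down, then code is produced bottom-up and accepted only when it passes the tests of its node, with every artifact recorded in a trace.

```python
from typing import Callable, Dict, List

# Hypothetical requirement graph: node id -> child ids (a small tree for brevity).
GRAPH: Dict[str, List[str]] = {
    "ROOT": ["REQ-1", "REQ-2"],
    "REQ-1": ["REQ-1.1"],
    "REQ-1.1": [],
    "REQ-2": [],
}


def top_down(node: str, graph: Dict[str, List[str]], order: List[str]) -> None:
    """Pre-order walk: a parent is designed before its children."""
    order.append(node)
    for child in graph[node]:
        top_down(child, graph, order)


def bottom_up(node: str, graph: Dict[str, List[str]], order: List[str]) -> None:
    """Post-order walk: leaves are implemented before their parents."""
    for child in graph[node]:
        bottom_up(child, graph, order)
    order.append(node)


def compile_requirements(
    graph: Dict[str, List[str]],
    root: str,
    generate: Callable[[str, int], str],
    run_tests: Callable[[str, str], bool],
    max_attempts: int = 3,
) -> Dict[str, Dict[str, object]]:
    # Phase 1 (top-down): record a placeholder interface and test suite per node.
    design_order: List[str] = []
    top_down(root, graph, design_order)
    trace: Dict[str, Dict[str, object]] = {
        n: {"interface": f"iface:{n}", "tests": f"tests:{n}"} for n in design_order
    }

    # Phase 2 (bottom-up): generate code per node, retrying until its tests pass.
    build_order: List[str] = []
    bottom_up(root, graph, build_order)
    for n in build_order:
        for attempt in range(1, max_attempts + 1):
            code = generate(n, attempt)
            if run_tests(n, code):
                trace[n]["code"] = code
                trace[n]["attempts"] = attempt
                break
        else:
            raise RuntimeError(f"{n} failed its tests after {max_attempts} attempts")
    return trace


# Stub agents (assumptions): REQ-2 only passes on its second attempt.
def fake_generate(node: str, attempt: int) -> str:
    return f"code:{node}:v{attempt}"


def fake_run_tests(node: str, code: str) -> bool:
    return not (node == "REQ-2" and code.endswith("v1"))


trace = compile_requirements(GRAPH, "ROOT", fake_generate, fake_run_tests)
```

The returned `trace` plays the role of a traceability record in miniature: each requirement identifier maps to the interface, tests, and code produced for it, so a later change to one requirement can be tied back to the artifacts it affects.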

Paper Structure (51 sections, 6 equations, 9 figures, 5 tables, 3 algorithms)

Introduction
Preliminaries
Formal Definition and Meta-Model of Requirement
An Example
Problem Statement
Approach
Overall Algorithm
Top-down: Requirement & Design Decomposition
Generating Interfaces
Generating Test Suites
Bottom-up: Constrained Code Generation
Implementation of Leaf Nodes
Implementation of Non-Leaf Nodes
The Reactive Verification Loop
Evaluation
...and 36 more sections

Figures (9)

Figure 1: The meta-model (or schema) of ARC's multi-modal requirements. Each requirement node carries a multi-modal description (text and picture). Each requirement node and scenario is assigned an identifier for reference. Each step is described with an action and an expectation via the Given, When, and Then keywords.
Figure 2: An instance of a multi-modal requirement document conforming to our DSL. Blue boxes (ROOT, REQ-) represent requirement nodes, orange boxes (SCE-) represent scenarios, and yellow boxes represent steps.
Figure 3: An overview of how ARC compiles a multi-modal requirement into a runnable web system. In addition to the web system, the resulting software artifacts include (1) a software architecture consisting of a set of interfaces (UI, API, and database), (2) a test suite for each generated interface, and (3) a traceability record that captures the provenance from requirement to design, test, and implementation.
Figure 4: Screenshots of the generated BookStack system.
Figure 5: Screenshots of the generated Keep system.
...and 4 more figures
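The meta-model described in the captions for Figures 1 and 2 (requirement nodes containing scenarios, scenarios containing Given/When/Then steps) can be sketched as plain data classes. This is an illustrative reading of the captions, not the paper's actual schema: the class names, field names, identifier numbering, and example texts are assumptions, and the structure is simplified to a tree even though the requirement graph is described as a directed acyclic graph.

```python
from dataclasses import dataclass, field
from typing import List, Optional

STEP_KEYWORDS = ("Given", "When", "Then")


@dataclass
class Step:
    """One step of a scenario, introduced by Given, When, or Then."""
    keyword: str
    text: str

    def __post_init__(self) -> None:
        if self.keyword not in STEP_KEYWORDS:
            raise ValueError(f"unknown step keyword: {self.keyword}")


@dataclass
class Scenario:
    ident: str  # "SCE-" prefix as in Figure 2; the numbering here is hypothetical
    steps: List[Step] = field(default_factory=list)


@dataclass
class RequirementNode:
    ident: str  # "ROOT" or a "REQ-" identifier
    description: str  # textual part of the multi-modal description
    image_path: Optional[str] = None  # picture part, if any (hypothetical field)
    scenarios: List[Scenario] = field(default_factory=list)
    children: List["RequirementNode"] = field(default_factory=list)

    def count_scenarios(self) -> int:
        """Count scenarios in this node and all descendants (tree assumption)."""
        return len(self.scenarios) + sum(c.count_scenarios() for c in self.children)


# A tiny hypothetical document with one requirement and one scenario.
login = RequirementNode(
    ident="REQ-1",
    description="User login",
    scenarios=[
        Scenario(
            ident="SCE-1",
            steps=[
                Step("Given", "the user is on the login page"),
                Step("When", "the user submits valid credentials"),
                Step("Then", "the home page is shown"),
            ],
        )
    ],
)
root = RequirementNode(ident="ROOT", description="Ticket-booking system", children=[login])
```

Validating the step keyword at construction time mirrors the idea that a DSL document must conform to the schema before compilation starts; a document with hundreds of scenarios would simply be a deeper tree of the same three node kinds.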
