ARLO: A Tailorable Approach for Transforming Natural Language Software Requirements into Architecture using LLMs

Tooraj Helmi

TL;DR

ARLO addresses the challenge of deriving software architecture from natural-language requirements by combining zero-shot LLM-based extraction of architecturally significant requirements (ASRs) and quality attributes (QAs) with an optimization-based decision process. It introduces architecturally influencing requirements (AIRs), grouping strategies (CGs and CCGs), and a tailorable QA-architecture matrix to guide architecture selection through an ILP objective: maximize the weighted QA satisfaction across condition groups, formalized as $\text{Score}_j = \sum_{i=1}^{n} M_{ij} \cdot x_i$ and $\max \sum_{j=1}^{m} (\text{Score}_j \cdot W_j)$ subject to group constraints. The approach is validated on three real-world systems (Bamboo, Aptana, Spring XD), demonstrating scalability with increasing requirements and sensitivity to matrix configuration and AIRs, as well as the nuanced impact of Concurrent Condition Groups on architectural decisions. ARLO’s contributions include an end-to-end NL-to-architecture workflow, traceability from ASRs to QAs, and publicly available tooling, offering a data-driven, transparent means to explore architecture alternatives aligned with requirements. The work has practical significance for architects seeking reproducible, auditable decisions and for researchers pursuing automation in requirements-to-architecture mapping.
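The objective above can be illustrated with a small sketch. It is not ARLO's tooling: it assumes that $x_i$ is a binary variable selecting architectural choice $i$, that $M_{ij}$ scores choice $i$ against quality attribute $j$, that $W_j$ is the weight of quality attribute $j$, and that the "group constraints" mean exactly one choice is selected per group. The function name `select_architecture` and all data values are hypothetical, and exhaustive enumeration stands in for an ILP solver, which is only practical for tiny inputs.

```python
from itertools import product


def select_architecture(M, W, groups):
    """Pick one choice per group to maximise sum_j W[j] * Score_j.

    M[i][j]: assumed score of architectural choice i for quality attribute j.
    W[j]:    assumed weight of quality attribute j.
    groups:  lists of choice indices; exactly one index is picked from each
             group (an assumed reading of the "group constraints").
    """
    n_qas = len(W)
    best_pick, best_value = None, None
    # Enumerate every feasible selection: one choice from each group.
    for pick in product(*groups):
        # Score_j = sum_i M[i][j] * x_i, where x_i = 1 only for picked choices.
        scores = [sum(M[i][j] for i in pick) for j in range(n_qas)]
        # Objective: sum_j Score_j * W_j.
        value = sum(s * w for s, w in zip(scores, W))
        if best_value is None or value > best_value:
            best_pick, best_value = pick, value
    return best_pick, best_value


# Hypothetical data: four architectural choices, two quality attributes.
M = [
    [2, 0],  # choice 0
    [0, 1],  # choice 1
    [1, 1],  # choice 2
    [0, 2],  # choice 3
]
W = [1, 3]                 # the second quality attribute matters three times as much
groups = [[0, 1], [2, 3]]  # choose one of {0, 1} and one of {2, 3}

print(select_architecture(M, W, groups))
```

With these made-up numbers the heavier weight on the second quality attribute steers the selection toward choices 1 and 3. Changing `W` or the entries of `M` changes the outcome, which is the kind of sensitivity to matrix configuration that the summary describes.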

Abstract

Software requirements expressed in natural language (NL) frequently suffer from verbosity, ambiguity, and inconsistency. This creates a range of challenges, including selecting an appropriate architecture for a system and assessing different architectural alternatives. Relying on human expertise to accomplish the task of mapping NL requirements to architecture is time-consuming and error-prone. This paper proposes ARLO, an approach that automates this task by leveraging (1) a set of NL requirements for a system, (2) an existing standard that specifies architecturally relevant software quality attributes, and (3) a readily available Large Language Model (LLM). Specifically, ARLO determines the subset of NL requirements for a given system that is architecturally relevant and maps that subset to a tailorable matrix of architectural choices. ARLO applies integer linear programming on the architectural-choice matrix to determine the optimal architecture for the current requirements. We demonstrate ARLO's efficacy using a set of real-world examples. We highlight ARLO's ability (1) to trace the selected architectural choices to the requirements and (2) to isolate NL requirements that exert a particular influence on a system's architecture. This allows the identification, comparative assessment, and exploration of alternative architectural choices based on the requirements and constraints expressed therein.

Paper Structure

This paper contains 21 sections, 2 equations, 5 figures, 14 tables, 1 algorithm.

Figures (5)

  • Figure 1: Overview of ARLO
  • Figure 2: ARLO's view of a software system
  • Figure 3: An excerpt from ARLO's Step 1 Output for UMS
  • Figure 4: Clustering ASRs before Applying Algorithm 1
  • Figure 5: The AIR set size distribution