Variability-Driven User-Story Generation using LLM and Triadic Concept Analysis
Alexandre Bazin, Alain Gutierrez, Marianne Huchard, Pierre Martin, Yulin Zhang
TL;DR
This work tackles generating a new system's user-story set in Software Product Lines by leveraging a 3-dimensional dataset ($system$, $role$, $feature$) and combining Triadic Concept Analysis ($TCA$) with Large Language Model (LLM) prompting. It presents a two-stage process: domain engineering builds a variability model and design options from existing system families via $TCA$ implications, while application engineering uses the designer's selections and LLM prompts to produce and refine an initial user-story set, then extends it with triadic-consistent additions. Evaluation on a dataset from 67 websites (1546 triples; 687 $system \times (role;feature)$ implications) across 20 LLM conversations shows that design options stabilize and that the LLM can generate near-valid user-story sets, though conversations vary in how they apply implications and in how much they augment the set during refinement. The hybrid approach demonstrates practical potential for guiding domain- and application-engineering tasks in SPL, with future extensions toward richer triadic or polyadic analyses and multi-LLM coordination to reduce randomness.
Abstract
A widely used Agile practice for requirements is to produce a set of user stories (also called "agile product backlog"), which roughly includes a list of pairs (role, feature), where the role handles the feature for a certain purpose. In the context of Software Product Lines, the requirements for a family of similar systems are thus a family of user-story sets, one per system, leading to a 3-dimensional dataset composed of sets of triples (system, role, feature). In this paper, we combine Triadic Concept Analysis (TCA) and Large Language Model (LLM) prompting to suggest the user-story set required to develop a new system, relying on the variability logic of an existing system family. This process consists of 1) computing 3-dimensional variability expressed as a set of TCA implications, 2) providing the designer with intelligible design options, 3) capturing the designer's selection of options, 4) proposing a first user-story set corresponding to this selection, 5) consolidating its validity according to the implications identified in step 1, while completing it if necessary, and 6) leveraging the LLM to make the resulting website more comprehensive. This process is evaluated with a dataset comprising the user-story sets of 67 similar-purpose websites.
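To make the data shape and step 5 concrete, here is a minimal Python sketch under stated assumptions. The triples, the implication, and the function names (`stories_by_system`, `holds`, `complete`) are all hypothetical and are not taken from the paper's dataset or tooling. An implication is assumed here to be a pair of sets of (role, feature) pairs, premise and conclusion, that must hold in every system. The sketch does not perform TCA itself: it only checks a given implication against the triples and completes a selection by closing it under the implications.

```python
from collections import defaultdict

# Hypothetical triples (system, role, feature); illustrative only.
TRIPLES = [
    ("shop_a", "customer", "browse catalog"),
    ("shop_a", "customer", "pay online"),
    ("shop_a", "admin", "manage catalog"),
    ("shop_b", "customer", "browse catalog"),
    ("shop_b", "admin", "manage catalog"),
]


def stories_by_system(triples):
    """Group triples into one user-story set of (role, feature) pairs per system."""
    sets = defaultdict(set)
    for system, role, feature in triples:
        sets[system].add((role, feature))
    return dict(sets)


def holds(premise, conclusion, story_sets):
    """True if every system containing the premise also contains the conclusion."""
    return all(conclusion <= s for s in story_sets.values() if premise <= s)


def complete(selection, implications):
    """Close a selection under (premise, conclusion) implications until a fixed point."""
    closed = set(selection)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed


# Assumed implication: "customer pays online" implies "admin manages catalog".
RULE = (
    frozenset({("customer", "pay online")}),
    frozenset({("admin", "manage catalog")}),
)

if __name__ == "__main__":
    story_sets = stories_by_system(TRIPLES)
    print(holds(RULE[0], RULE[1], story_sets))
    print(sorted(complete({("customer", "pay online")}, [RULE])))
```

In this toy setting, a designer selection containing only the "pay online" story is completed with the "manage catalog" story, which mirrors the idea of consolidating a first user-story set against the implications and completing it where needed.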
