ReqBrain: Task-Specific Instruction Tuning of LLMs for AI-Assisted Requirements Generation

Mohammad Kasra Habib, Daniel Graziotin, Stefan Wagner

TL;DR

ReqBrain tackles the labor-intensive practice of requirements elicitation and specification by fine-tuning 7B-scale LLMs on a task-specific instruction dataset aligned to ISO 29148. The fine-tuned models produce authentic and adequate requirements that are indistinguishable from human-authored ones, and they surpass both their untuned baselines and a general-purpose model in automated metrics (BERT Score and FRUGAL) and in human evaluations. The work contributes an instruction dataset and open-source models, enabling reproducibility and RAG-based extensions for enterprise data. The findings support integrating AI-assisted requirements generation into early development phases, with future work extending to defect identification, test-case generation, and agile user story creation.

Abstract

Requirements elicitation and specification remains a labor-intensive, manual process prone to inconsistencies and gaps, presenting a significant challenge in modern software engineering. Emerging studies underscore the potential of employing large language models (LLMs) for automated requirements generation to support requirements elicitation and specification; however, it remains unclear how to implement this effectively. In this work, we introduce ReqBrain, an AI-assisted tool that employs a fine-tuned LLM to generate authentic and adequate software requirements. Software engineers can engage with ReqBrain through chat-based sessions to automatically generate software requirements and categorize them by type. We curated a high-quality dataset of ISO 29148-compliant requirements and fine-tuned five 7B-parameter LLMs to determine the most effective base model for ReqBrain. The top-performing model, Zephyr-7b-beta, achieved 89.30% F1 using the BERT score and a FRUGAL score of 91.20 in generating authentic and adequate requirements. Human evaluations further confirmed ReqBrain's effectiveness in generating requirements. Our findings suggest that generative AI, when fine-tuned, has the potential to improve requirements elicitation and specification, paving the way for future extensions into areas such as defect identification, test case generation, and agile user story creation.

Paper Structure

This paper contains 53 sections, 5 equations, 3 figures, 11 tables.

Figures (3)

  • Figure 1: AI-assisted generation of software requirements using ReqBrain.
  • Figure 2: AI-assisted requirements generation approach overview, integrating ReqBrain.
  • Figure 3: Performance metrics across three task categories (see the section on the instruct dataset).