RECOVER: Toward Requirements Generation from Stakeholders' Conversations

Gianmario Voria, Francesco Casillo, Carmine Gravino, Gemma Catolino, Fabio Palomba

TL;DR

RECOVER presents a three-step pipeline to extract and generate system requirements from stakeholder conversations, combining ML-based classification, context-preserving processing, and LLM-driven generation. The approach prioritizes recall at the turn level and uses a structured evaluation with expert oracles and a ChatGPT baseline to demonstrate added value over generic LLM prompting. Across turn-level and whole-conversation analyses, RECOVER shows promising accuracy, completeness, and actionability, with in-vivo validation indicating robustness in noisy industrial settings, albeit with necessary human oversight to mitigate hallucinations and ensure traceability. The work advances conversational requirements engineering by enabling automated yet human-validated elicitation, and it provides replication materials and a path for extending to non-functional requirements and traceability features.

Abstract

Stakeholders' conversations in requirements elicitation meetings hold valuable insights into system and client needs. However, manually extracting requirements is time-consuming, labor-intensive, and prone to errors and biases. While current state-of-the-art methods assist in summarizing stakeholder conversations and classifying requirements based on their nature, there is a noticeable lack of approaches capable of both identifying requirements within these conversations and generating corresponding system requirements. These approaches would assist requirement identification, reducing engineers' workload, time, and effort. To address this gap, this paper introduces RECOVER (Requirements EliCitation frOm conVERsations), a novel conversational requirements engineering approach that leverages natural language processing and large language models (LLMs) to support practitioners in automatically extracting system requirements from stakeholder interactions. The approach is evaluated using a mixed-method study that combines performance analysis with a user study involving requirements engineers, targeting two levels of granularity. First, at the conversation turn level, the evaluation measures RECOVER's accuracy in identifying requirements-relevant dialogue and the quality of generated requirements in terms of correctness, completeness, and actionability. Second, at the entire conversation level, the evaluation assesses the overall usefulness and effectiveness of RECOVER in synthesizing comprehensive system requirements from full stakeholder discussions. Empirical evaluation of RECOVER shows promising performance, with generated requirements demonstrating satisfactory correctness, completeness, and actionability. The results also highlight the potential of automating requirements elicitation from conversations as an aid that enhances efficiency while maintaining human oversight.

Paper Structure

This paper contains 22 sections, 6 figures, 6 tables.

Figures (6)

  • Figure 1: An overview of the main steps performed by RECOVER.
  • Figure 2: Running example of RECOVER.
  • Figure 3: Overview of the research method proposed for our study.
  • Figure 4: Example of the pairwise comparisons performed to answer RQ1.
  • Figure 5: Results distribution in the different groups.
  • ...and 1 more figure