PassionNet: An Innovative Framework for Duplicate and Conflicting Requirements Identification

Summra Saleem, Muhammad Nabeel Asim, Andreas Dengel

TL;DR

PassionNet tackles the problem of duplicate and conflicting requirements in Requirements Engineering by combining semantic insights from large language models with traditional and advanced similarity measures. It defines three families of predictive pipelines, culminating in a hybrid approach that fuses LLM context with multi-model similarity knowledge. Across six public benchmarks, hybrid pipelines consistently outperform the other two types, achieving up to 13% higher macro F1 than prior state-of-the-art predictors. The framework performs robustly across diverse datasets and holds promise for broad applicability in software engineering and other textual similarity tasks.

Abstract

Early detection and resolution of duplicate and conflicting requirements can significantly enhance project efficiency and overall software quality. Researchers have developed various computational predictors that leverage the potential of Artificial Intelligence (AI) to detect duplicate and conflicting requirements. However, these predictors lack performance, and more effective approaches are required to empower software development processes. To address the need for a unified predictor that can accurately identify duplicate and conflicting requirements, this research offers a comprehensive framework that facilitates the development of 3 different types of predictive pipelines: language model based, multi-model similarity knowledge-driven, and large language model (LLM) context + multi-model similarity knowledge-driven. In the first type of predictive pipelines, the framework facilitates identification of conflicting and duplicate requirements by leveraging 8 distinct types of LLMs. In the second type, the framework supports the development of predictive pipelines that leverage multi-scale and multi-model similarity knowledge, ranging from traditional similarity computation methods to advanced similarity vectors generated by LLMs. In the third type, the framework synthesizes predictive pipelines by integrating contextual insights from LLMs with multi-model similarity knowledge. Across 6 public benchmark datasets, extensive testing of 760 distinct predictive pipelines demonstrates that hybrid predictive pipelines consistently outperform the other two types of predictive pipelines in accurately identifying duplicate and conflicting requirements. The hybrid predictive pipeline outperformed existing state-of-the-art predictors by an overall margin of 13% in terms of F1-score.
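
To make the third (hybrid) pipeline type more concrete, the following is a minimal sketch of how similarity scores for a requirement pair might be combined with a contextual vector. It is an illustration under stated assumptions, not the authors' implementation: the function names, the choice of Jaccard and term-frequency cosine as the "traditional" similarity measures, the whitespace tokenizer, and fusion by simple concatenation are all hypothetical, and the context vector is a placeholder for whatever an LLM encoder would produce.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (assumed stand-in for real preprocessing)."""
    return text.lower().split()


def jaccard(tokens_a, tokens_b):
    """Jaccard similarity over the two token sets."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine_tf(tokens_a, tokens_b):
    """Cosine similarity over raw term-frequency vectors."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_features(req_a, req_b):
    """Similarity scores for one requirement pair (two measures, by assumption)."""
    tokens_a, tokens_b = tokenize(req_a), tokenize(req_b)
    return [jaccard(tokens_a, tokens_b), cosine_tf(tokens_a, tokens_b)]


def hybrid_features(context_vector, req_a, req_b):
    """Assumed fusion step: concatenate a context vector with similarity scores."""
    return list(context_vector) + similarity_features(req_a, req_b)


if __name__ == "__main__":
    a = "The system shall log all failed login attempts"
    b = "The system shall record every failed login attempt"
    placeholder_context = [0.12, -0.40, 0.33]  # hypothetical LLM pair embedding
    print(hybrid_features(placeholder_context, a, b))
```

The resulting feature vector would then be passed to a downstream classifier that labels the pair as duplicate, conflicting, or neither; which classifier and which similarity measures the framework actually uses is not stated in this section.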

Paper Structure

This paper contains 17 sections, 3 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: A High-Level Overview of PassionNet Framework
  • Figure 2: Comparison of different algorithms: VSM, LSI, JSD, and NMF.
  • Figure 3: LDA
  • Figure 4: Description of Conflict and Duplicate Detection Datasets
  • Figure 5: Confusion matrix
  • ...and 3 more figures