A transfer learning approach for automatic conflicts detection in software requirement sentence pairs based on dual encoders

Yizheng Wang, Tao Jiang, Jinyan Bai, Zhengbin Zou, Tiancheng Xue, Nan Zhang, Jie Luan

TL;DR

This paper proposes a Transferable Software Requirement Conflict Detection Framework based on SBERT and SimCSE, termed TSRCDF-SS, which synergistically integrates sequential and cross-domain transfer learning.

Abstract

Software requirement documents (RDs) typically contain tens of thousands of individual requirements, and ensuring consistency among these requirements is critical for the success of software engineering projects. Automated detection methods can significantly enhance efficiency and reduce costs; however, existing approaches still face several challenges, including low detection accuracy on imbalanced data, limited semantic extraction due to the use of a single encoder, and suboptimal performance in cross-domain transfer learning. To address these issues, this paper proposes a Transferable Software Requirement Conflict Detection Framework based on SBERT and SimCSE, termed TSRCDF-SS. First, the framework employs two independent encoders, Sentence-BERT (SBERT) and Simple Contrastive Sentence Embedding (SimCSE), to generate sentence embeddings for requirement pairs, which are then combined using a six-element concatenation strategy. Second, the classifier is enhanced by a two-layer fully connected feedforward neural network (FFNN) with a hybrid loss optimization strategy that integrates a variant of Focal Loss, domain-specific constraints, and a confidence-based penalty term. Finally, the framework integrates sequential and cross-domain transfer learning. Experimental results show that the proposed framework achieves a 10.4% improvement in both macro-F1 and weighted-F1 scores in in-domain settings, and an 11.4% increase in macro-F1 in cross-domain scenarios.
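
The dual-encoder fusion and the loss described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out which six elements are concatenated, so the layout below (u, v, and |u - v| from each of the two encoders) is an assumption, as are the function names `pair_features`, `six_element_concat`, and `focal_loss`. The loss shown is the standard Focal Loss, not the paper's variant, and it omits the domain-specific constraints and the confidence-based penalty term. The embeddings are plain lists of floats standing in for real SBERT and SimCSE outputs.

```python
import math
from typing import List, Sequence

Vector = List[float]


def pair_features(u: Sequence[float], v: Sequence[float]) -> Vector:
    """Return [u, v, |u - v|] for one encoder's embeddings of a requirement pair."""
    if len(u) != len(v):
        raise ValueError("embeddings from the same encoder must have equal length")
    diff = [abs(a - b) for a, b in zip(u, v)]
    return list(u) + list(v) + diff


def six_element_concat(
    u_sbert: Sequence[float],
    v_sbert: Sequence[float],
    u_simcse: Sequence[float],
    v_simcse: Sequence[float],
) -> Vector:
    """Fuse both encoders' views of a requirement pair into one feature vector.

    Assumed layout (not confirmed by the abstract):
    (u, v, |u - v|) from SBERT followed by (u, v, |u - v|) from SimCSE.
    """
    return pair_features(u_sbert, v_sbert) + pair_features(u_simcse, v_simcse)


def focal_loss(
    probs: Sequence[float], target: int, gamma: float = 2.0, alpha: float = 1.0
) -> float:
    """Standard focal loss for one example: -alpha * (1 - p_t)^gamma * log(p_t).

    This is the textbook form, used here only to show how well-classified
    examples are down-weighted; the paper uses its own variant.
    """
    # Clamp the true-class probability to avoid log(0).
    p_t = min(max(probs[target], 1e-12), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Toy 2-dimensional embeddings for the two requirements of one pair.
features = six_element_concat([1.0, 2.0], [0.5, 2.5], [0.0, 1.0], [1.0, 1.0])
easy = focal_loss([0.9, 0.05, 0.05], target=0)
hard = focal_loss([0.4, 0.3, 0.3], target=0)
```

With `gamma = 0` and `alpha = 1` the loss reduces to ordinary cross-entropy; larger `gamma` shrinks the contribution of confidently correct examples, which is why losses of this family are commonly chosen for imbalanced data such as the conflict/non-conflict pairs discussed here.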

Paper Structure

This paper contains 22 sections, 6 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: TSRCDF-SS structure diagram. The framework comprises a dual encoder and an improved classifier, and combines sequential transfer with cross-domain transfer.
  • Figure 2: t-SNE projection of encoder embeddings. Sentence embeddings from individual encoders and from combined encoders are compared using t-SNE dimensionality reduction, highlighting the differences in their encoding capabilities. The dataset used is TRAINNLI.
  • Figure 3: Software requirement pair encoding model based on SBERT and SimCSE. The SBERT and SimCSE models each encode the individual requirements into embeddings, which are then fused into the six-element concatenated embedding.
  • Figure 4: Pseudocode of the fusion algorithm for sequential transfer and cross-domain transfer.
  • Figure 5: Comparison of encoder combination results. Precision, recall, and F1 of different encoder combinations on the TRAINNLI (30000) dataset.
  • ...and 2 more figures