Enhancing Requirements Traceability Link Recovery: A Novel Approach with T-SimCSE

Ye Wang, Wenqing Wang, Kun Hu, Qiao Huang, Liping Zhao

Abstract

Requirements traceability plays an important role in ensuring software quality and responding to changes in requirements. Requirements trace links (such as the links between requirements and other software artifacts) underpin the modeling and implementation of requirements traceability. With the rapid development of artificial intelligence, techniques based on pre-trained language models (PLMs) are increasingly applied to the automatic recovery of requirements trace links. However, the trace links recovered by these approaches are not accurate enough, and many approaches require a large labeled dataset for training, while very few labeled datasets are currently available. To address these limitations, this paper proposes a novel requirements traceability link recovery approach called T-SimCSE, which is based on the PLM SimCSE. SimCSE has the advantages of not requiring labeled data, having broad applicability, and performing well. T-SimCSE first uses the SimCSE model to calculate the similarity between requirements and target artifacts, then employs a new metric, specificity, to reorder those target artifacts. Finally, trace links are created between each requirement and its top-K target artifacts. We have evaluated T-SimCSE on ten public datasets by comparing it with other approaches. The results show that T-SimCSE achieves superior performance in terms of recall and Mean Average Precision (MAP).
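The ranking and evaluation steps named in the abstract can be illustrated with a minimal sketch. It assumes that requirements and target artifacts have already been embedded as plain vectors (in the paper this is done with SimCSE; the toy vectors below are invented for illustration), that similarity is cosine similarity, and that average precision is normalized by the number of relevant artifacts. The specificity-based reordering is deliberately left out because the abstract does not define the metric. The function names `cosine`, `top_k_links`, and `average_precision` are hypothetical and not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_links(req_vec, artifact_vecs, k):
    """Rank artifact names by similarity to one requirement; keep the top k.

    artifact_vecs maps an artifact name to its (assumed precomputed) embedding.
    """
    ranked = sorted(
        artifact_vecs,
        key=lambda name: cosine(req_vec, artifact_vecs[name]),
        reverse=True,
    )
    return ranked[:k]


def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of true links.

    MAP is the mean of this value over all requirements.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, name in enumerate(ranked, start=1):
        if name in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant)


# Toy data, invented for illustration only.
requirement = [1.0, 0.0]
artifacts = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
links = top_k_links(requirement, artifacts, k=2)
ap = average_precision(["A", "B", "C"], {"A", "C"})
```

In this toy case the requirement is linked to artifacts `A` and `C`, the two with the highest cosine similarity. A reordering step such as the paper's specificity metric would sit between the similarity ranking and the top-K cut.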

Paper Structure (29 sections, 10 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: A motivating example
  • Figure 2: The process of T-SimCSE
  • Figure 3: A rewarding example
  • Figure 4: Heatmaps of MAP for $k_1$ and $k_2$ in T-SimCSE across ten datasets
  • Figure 5: The comparison of the Precision-Recall curves of T-SimCSE and SimCSE
  • ...and 1 more figure