Requirements Quality Research Artifacts: Recovery, Analysis, and Management Guideline

Julian Frattini, Lloyd Montgomery, Davide Fucci, Michael Unterkalmsteiner, Daniel Mendez, Jannik Fischbach

TL;DR

Requirements quality research relies on artifacts (data sets and implementations) to benchmark and transfer methods, but many artifacts are unavailable or undisclosed, hindering reproducibility. This work extends a two-phase artifact recovery initiative to recover 10 data sets and 7 implementations and employs Bayesian data analysis to identify factors affecting artifact availability, persistence, and recoverability, while also delivering a pragmatic Open Science Artifact Management Guideline. The results show that artifact availability is improving over time and that public hosting services significantly boost artifact persistence, underscoring the value of open science practices in the field. The study provides actionable guidance and recovered artifacts to enable reuse and reproduction, along with replication resources to advance open science across software engineering research.

Abstract

Requirements quality research, which is dedicated to assessing and improving the quality of requirements specifications, depends on research artifacts like data sets (containing information about quality defects) and implementations (automatically detecting and removing these defects). However, recent research revealed that the majority of these research artifacts have become unavailable or have never been disclosed, which inhibits progress in the research domain. In this work, we aim to improve the availability of research artifacts in requirements quality research. To this end, we (1) extend an artifact recovery initiative, (2) empirically evaluate the reasons for artifact unavailability using Bayesian data analysis, and (3) compile a concise guideline for open science artifact disclosure. Our results include 10 recovered data sets and 7 recovered implementations, empirical support for artifact availability improving over time and for the positive effect of public hosting services, and a pragmatic artifact management guideline open for community comments. With this work, we hope to encourage and support adherence to open science principles and improve the availability of research artifacts for the requirements quality research community.

Paper Structure (42 sections, 14 figures, 6 tables)

This paper contains 42 sections, 14 figures, 6 tables.

Figures (14)

  • Figure 1: Overview of the two-phase recovery process
  • Figure 2: Status of correspondence
  • Figure 3: Distribution of time for correspondence in days (excluding outliers)
  • Figure 4: Distribution of frequency of correspondence in the number of emails
  • Figure 5: Change of availability in data sets
  • ...and 9 more figures