Requirements Quality Research Artifacts: Recovery, Analysis, and Management Guideline
Julian Frattini, Lloyd Montgomery, Davide Fucci, Michael Unterkalmsteiner, Daniel Mendez, Jannik Fischbach
TL;DR
Requirements quality research relies on artifacts (data sets and implementations) to benchmark and transfer methods, but many artifacts are unavailable or undisclosed, hindering reproducibility. This work extends a two-phase artifact recovery initiative to recover 10 data sets and 7 implementations and employs Bayesian data analysis to identify factors affecting artifact availability, persistence, and recoverability, while also delivering a pragmatic Open Science Artifact Management Guideline. The results show that artifact availability is improving over time and that public hosting services significantly boost artifact persistence, underscoring the value of open science practices in the field. The study provides actionable guidance and recovered artifacts to enable reuse and reproduction, along with replication resources to advance open science across software engineering research.
Abstract
Requirements quality research, which is dedicated to assessing and improving the quality of requirements specifications, depends on research artifacts like data sets (containing information about quality defects) and implementations (automatically detecting and removing these defects). However, recent research revealed that the majority of these research artifacts have become unavailable or have never been disclosed, which inhibits progress in the research domain. In this work, we aim to improve the availability of research artifacts in requirements quality research. To this end, we (1) extend an artifact recovery initiative, (2) empirically evaluate the reasons for artifact unavailability using Bayesian data analysis, and (3) compile a concise guideline for open science artifact disclosure. Our results include 10 recovered data sets and 7 recovered implementations, empirical support for artifact availability improving over time and for the positive effect of public hosting services, and a pragmatic artifact management guideline open for community comments. With this work, we hope to encourage and support adherence to open science principles and improve the availability of research artifacts for the requirements quality research community.
