
Research Artifacts in Software Engineering Publications: Status and Trends

Mugeng Liu, Xiaolong Huang, Wei He, Yibing Xie, Jie M. Zhang, Xiang Jing, Zhenpeng Chen, Yun Ma

TL;DR

The paper investigates the status and trends of research artifacts in software engineering publications to understand how artifacts are prepared, documented, maintained, and consumed. By manually collecting 1,487 artifacts from 2,196 papers across ASE, FSE, ICSE, and ISSTA (2017–2022) and applying both manual labeling and automated code-smell analysis (Python via Pylint and Java via PMD), it reveals a rising adoption of artifacts, increasing use of Zenodo, and a persistent gap between artifact quality and reuse in real-world SE applications. The study provides concrete metrics on storage platforms, URL visibility, content diversity, documentation and testing practices, maintenance patterns, and the popularity of artifacts (stars), highlighting opportunities to improve reproducibility and guidance for future artifact construction. It also documents challenges such as dead URLs and code-quality concerns, offering recommendations for better artifact preparation, quality assurance, and long-term maintenance. Overall, the work informs researchers, conference organizers, and practitioners about current artifact practices and practical steps to enhance their reproducibility and impact.

Abstract

The Software Engineering (SE) community has been embracing the open science policy and encouraging researchers to disclose artifacts in their publications. However, the status and trends of artifact practice and quality remain unclear, lacking insights on further improvement. In this paper, we present an empirical study to characterize the research artifacts in SE publications. Specifically, we manually collect 1,487 artifacts from all 2,196 papers published in top-tier SE conferences (ASE, FSE, ICSE, and ISSTA) from 2017 to 2022. We investigate the common practices (e.g., URL location and format, storage websites), maintenance activities (e.g., last update time and URL validity), popularity (e.g., the number of stars on GitHub and characteristics), and quality (e.g., documentation and code smell) of these artifacts. Based on our analysis, we reveal a rise in publications providing artifacts. The usage of Zenodo for sharing artifacts has significantly increased. However, artifacts stored in GitHub tend to receive few stars, indicating a limited influence on real-world SE applications. We summarize the results and provide suggestions to different stakeholders in conjunction with current guidelines.

Paper Structure

This paper contains 34 sections, 11 figures, and 3 tables.

Figures (11)

  • Figure 1: The number of papers with and without artefacts in each conference from 2017 to 2021.
  • Figure 2: The ratio of storage websites used by artefacts from 2017 to 2021.
  • Figure 3: The ratio of positions at which the artefact URL is provided in a paper.
  • Figure 4: The number of artefacts that consist of different kinds of content.
  • Figure 5: The number of times each programming language is used in artefacts.
  • ...and 6 more figures