An Exploratory Study of Documentation Strategies for Product Features in Popular GitHub Projects

Tim Puhlfürß, Lloyd Montgomery, Walid Maalej

TL;DR

The paper addresses the problem of fragmented feature knowledge in open-source software on GitHub by performing a qualitative exploratory content analysis of 25 popular projects that meet GitHub Community Standards. It identifies six textual artefact types and nine descriptive element types used to document product features, and analyzes six strategies for linking features to code and four for linking code back to features, revealing generally weak traceability. The study contributes an empirical characterization of current documentation practices and highlights gaps that hinder maintainability and comprehension. It suggests practical directions for guidelines and tooling to improve feature documentation and traceability in OSS projects.

Abstract

[Background] In large open-source software projects, development knowledge is often fragmented across multiple artefacts and contributors such that individual stakeholders are generally unaware of the full breadth of the product features. However, users want to know what the software is capable of, while contributors need to know where to fix, update, and add features. [Objective] This work aims to understand how feature knowledge is documented in GitHub projects and how it is linked (if at all) to the source code. [Method] We conducted an in-depth qualitative exploratory content analysis of 25 popular GitHub repositories that provided the documentation artefacts recommended by GitHub's Community Standards indicator. We first extracted strategies used to document software features in textual artefacts and then strategies used to link the feature documentation with source code. [Results] We observed feature documentation in all studied projects in artefacts such as READMEs, wikis, and website resource files. However, the features were often described in an unstructured way. Additionally, tracing techniques to connect feature documentation and source code were rarely used. [Conclusions] Our results suggest that feature documentation in open-source projects is lacking (or given low priority), that normalised structures are little used, and that explicit references to source code are rare. As a result, product feature traceability is likely to be very limited, and maintainability is likely to suffer over time.

Paper Structure (6 sections, 4 figures, 1 table)

This paper contains 6 sections, 4 figures, 1 table.

Figures (4)

  • Figure 1: Textual artefacts that document product features
  • Figure 2: Descriptive elements to document product features
  • Figure 3: Strategies to link product features to source code
  • Figure 4: Strategies to link source code to product features