
The Nature of Technical Debt in Research Software

Neil A. Ernst, Ahmed Musa Awon, Swapnil Hingmire, Ze Shi Li

Abstract

Research software (also called scientific software) is essential for advancing scientific endeavours. It encapsulates complex algorithms and domain-specific knowledge and is a fundamental component of all science. A pervasive challenge in developing research software is technical debt, which can adversely affect reliability, maintainability, and scientific validity. Research software often relies on the initiative of the scientific community for maintenance, which requires diverse expertise in both scientific and software engineering domains. The extent and nature of technical debt in research software are little studied: in particular, what forms it takes and what the science teams developing this software think about their technical debt. In this paper we describe our multi-method study of technical debt in research software. We begin with instances of self-reported technical debt in research code, examining 28k code comments across nine research software projects. Then, building on our findings, we interview research software engineers and scientists about how this technical debt manifests itself in their experience, and what costs it has for research software and research outputs more generally. We identify nine types of self-admitted technical debt unique to research software, and four themes impacting this technical debt.

Paper Structure (40 sections, 6 figures, 6 tables)

Figures (6)

  • Figure 1: Overview of our Convergent Parallel Mixed Methods research approach. After [storeyGuidelinesUsingMixed2024].
  • Figure 2: Percentage of SATD types across selected research projects, scientific debt highlighted.
  • Figure 3: Percentage of Scientific Debt indicators across research software.
  • Figure 4: Thematic maps showing how our analysis evolved.
  • Figure 5: Illustrative knowledge profiles for two hypothesized team member archetypes, based on Kelly's domains [kelly_scientific_2015]. These are not empirical measurements.
  • ...and 1 more figures