How to Sustain a Scientific Open-Source Software Ecosystem: Learning from the Astropy Project

Jiayi Sun, Aarya Patil, Youhai Li, Jin L. C. Guo, Shurui Zhou

TL;DR

This paper investigates the sustainability of scientific open-source software through a detailed case study of the Astropy Project, examining how interdisciplinary collaboration, open-source dynamics, and cross-project ecosystems shape long-term viability. Drawing on interviews with core contributors, a survey of disengaged contributors, and mining of cross-referenced issues and pull requests (PRs), it reveals persistent challenges in onboarding, retention, and equitable recognition of engineering work, and identifies four key modes of cross-project collaboration. The authors propose concrete practices (enhanced onboarding tools, better attribution of engineering contributions, and centralized infrastructure governance) and identify directions for tooling and research to support knowledge sharing and maintenance across scientific OSS ecosystems. The findings highlight sustainability considerations unique to scientific OSS and offer actionable guidance for developers, researchers, and funding agencies on sustaining the software and communities that underpin reproducible science.

Abstract

Scientific open-source software (OSS) has greatly benefited research communities through its transparent and collaborative nature. Given its critical role in scientific research, ensuring the sustainability of such software has become vital. Earlier studies have proposed sustainability strategies for conventional scientific software and for open-source communities. However, it remains unclear whether these solutions can be easily adapted to the integrated framework of scientific OSS and its larger ecosystem. This study examines the challenges and opportunities for enhancing the sustainability of scientific OSS in the context of interdisciplinary collaboration, open-source communities, and multi-project ecosystems. We conducted a case study on a widely used software ecosystem in the astrophysics domain, the Astropy Project, using a mixed-methods design. The design has three parts: an interview with core contributors regarding their participation in an interdisciplinary team; a survey of disengaged contributors about their motivations for contributing, their reasons for disengaging, and their suggestions for sustaining the communities; and an analysis of cross-referenced issues and pull requests to understand best practices for collaboration at the ecosystem level. Our study reveals the implications of major challenges for sustaining scientific OSS and proposes concrete suggestions for tackling these challenges.

Paper Structure (31 sections, 7 figures, 4 tables)

Figures (7)

  • Figure 1: The scientific Python ecosystem. Each dot is a package: its size represents the number of contributors, its color gradient indicates the commit count, and the color of its text label indicates the package category.
  • Figure 2: Research Method Overview.
  • Figure 3: Example of cross-reference events and the corresponding Cross-Reference Graph (CRG), in which nodes represent issues/PRs and edges connect source and target nodes across three projects.
  • Figure 4: Composition of contribution types for the 41 core contributors. Each dot represents a contributor, and its size reflects the number of commits. For each dot, the tuple $(i\%, d\%, c)$ gives the share of engineering-related contributions, the share of science-related contributions, and the total number of commits, in that order.
  • Figure 5: Mapping between contributors' motivations and their reasons for disengagement.
  • ...and 2 more figures