High-impact Scientific Software in Astronomy and its creators

Johannes Buchner

TL;DR

The results indicate that over 200 people are currently active on any given day improving software in astronomy. Half of the mapped development comes from US-affiliated institutes, and a large number of high-impact projects are led by a single person.

Abstract

In the last decades, scientific software has graduated from a hidden side-product to a first-class member of the astrophysics literature. We aim to quantify the activity and impact of software development for astronomy, using a systematic survey. Starting from the Astrophysics Source Code Library and the Journal of Open Source Software, we analyse 3432 public git-based scientific software packages. Paper abstract text analysis suggests seven dominant themes: cosmology, data reduction pipelines, exoplanets, hydrodynamic simulations, radiative transfer spectra simulation, statistical inference and galaxies. We present key individual software contributors, their affiliated institutes and countries of high-impact software in astronomy & astrophysics. We consider the number of citations to papers using the software and the number of person-days from their git repositories, as proxies for impact and complexity, respectively. We find that half of the mapped development is through US-affiliated institutes, and a large number of high-impact projects are led by a single person. Our results indicate that there are currently over 200 people active on any given day to improve software in astronomy.
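
As an illustration of the person-day proxy mentioned above, the following is a minimal sketch, not the paper's actual pipeline. It assumes that a person-day is one calendar day on which a given author has at least one commit; the abstract does not spell out the definition. The function name `person_days` and the sample commit records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def person_days(commits):
    """Count distinct calendar days with at least one commit, per author.

    `commits` is an iterable of (author, iso_timestamp) tuples, for example
    parsed from the output of `git log --format='%an|%aI'`.
    Assumption: one person-day = one (author, calendar day) pair.
    """
    days = defaultdict(set)
    for author, stamp in commits:
        # Several commits by the same author on the same day count once.
        days[author].add(datetime.fromisoformat(stamp).date())
    return {author: len(d) for author, d in days.items()}


# Hypothetical commit records, for illustration only.
commits = [
    ("alice", "2023-01-05T09:00:00+00:00"),
    ("alice", "2023-01-05T17:30:00+00:00"),
    ("alice", "2023-01-06T10:00:00+00:00"),
    ("bob", "2023-01-05T12:00:00+00:00"),
]
per_author = person_days(commits)
total = sum(per_author.values())
```

Summing the per-author counts gives a project-level total, which under this assumption would serve as the complexity proxy alongside the citation count used for impact.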

Paper Structure

This paper contains 10 sections, 1 equation, 6 figures.

Figures (6)

  • Figure 1: Sample distribution of the impact of scientific software. Impact is the total number of citations to papers using the software.
  • Figure 2: Landscape of astronomical software. Coloured contours and rectangles identify the seven principal themes from our bag-of-words analysis. In the 2-dimensional projection shown, representative software packages are named and colour-coded by their most important theme. Each code is marked by a small cross.
  • Figure 3: Comparing project impact and contribution per author. Some individual points are annotated with project name and most-contributing author. astropy is in orange, starlink in blue, yt in green.
  • Figure 4: A tree map visualisation of astronomical software. Each white rectangle identifies one author. The size of each rectangle is proportional to scientific impact multiplied by the number of days spent. Each black rectangle is one software package, with its name at the top left. The smallest packages are excluded from this figure.
  • Figure 5: Same as Figure 4, but removing the top 30 packages and showing smaller ones.
  • ...and 1 more figure