Making mathematical online resources FAIR: at the example of small phylogenetic trees

Tabea Bacher, Marina Garrote-López, Christiane Görgen, Marius J. Neubert

TL;DR

The paper addresses the challenge of making online mathematical resources FAIR and reproducible, using Small Phylogenetic Trees as a case study. It advocates a threefold approach combining (i) a reproducible software package built in Julia/OSCAR to certify results, (ii) a new website that transparently links theory, code, and computations, and (iii) comprehensive process documentation to provide generalizable guidelines. Central mathematical constructs such as the polynomial map $\phi_T$, the variety $V_T$, and its invariants $I(V_T)$ illustrate how algebraic-statistical models on trees relate to toric varieties. Serialization via the MRDI file format, together with YAML metadata, enhances machine readability and long-term reuse. The work demonstrates concrete improvements in findability, interoperability, and reproducibility through domain-specific citations, Zenodo snapshots, and explicit software licensing, providing a blueprint for sustaining online mathematics and paving the way for future expansion to additional models and networks.

Abstract

We report on the process of taking an early-2000s mathematical library, the Small Phylogenetic Trees, and transforming it into a FAIR, modern, and sustainable repository for data from algebraic phylogenetics. This process is based on a three-fold strategy: (1) writing a software package which enables the user to reproduce results of the database; (2) setting up a user-friendly new website with cross-links to theoretical publications, code snippets, and serialized output of computations; and (3) all the while documenting the steps we take in order to derive lessons learned which may be generalised to other such projects. This paper addresses (3). Part (1) is found at https://docs.oscar-system.org/dev/Experimental/AlgebraicStatistics/phylogenetics, and part (2) is located at https://algebraicphylogenetics.org.

Paper Structure (19 sections, 4 figures)

Figures (4)

  • Figure 1: Screenshots of Small Phylogenetic Trees as of January 2025. The first (a) shows the landing page, the second (b) an overview of all trees considered, and the third (c; reached after navigating through two more subpages) data on the three-leaf phylogenetic tree with Jukes-Cantor model.
  • Figure 2: This is a data layout plan, showing which mathematical and computational objects appear on Small Phylogenetic Trees, detailing their interdependencies and provenance. We use it to decide how to build the software package and which information to display on the website. CAS is an abbreviation for computer algebra system.
  • Figure 3: Focusing on the mathematics first and on its display second, we make two research data management plans: one for the software project and one for the corresponding website.
  • Figure 4: This is a screenshot of https://algebraicphylogenetics.org under development in January 2025: on the left is the landing page, on the right data on the three-leaf phylogenetic tree with Jukes-Cantor model.