A Framework for Supporting the Reproducibility of Computational Experiments in Multiple Scientific Domains

Lázaro Costa; Susana Barbosa; Jácome Cunha

TL;DR

The paper addresses the problem of reproducibility in computational science and introduces SciRep, a Docker-based framework that configures, executes, and packages experiments into portable research artifacts. SciRep infers languages and dependencies, supports multiple databases, and provides an API for external tooling, which lets it support reproducibility and replicability across scientific domains. An evaluation across 28 experiments from software engineering, databases, climate, medicine, and AI shows that SciRep can execute 16 of 18 runnable experiments (89%), and every experiment it executed reproduced the published results. On the tested set it outperforms Code Ocean, RenkuLab, and Whole Tale. The framework thus advances open science by simplifying the sharing and re-execution of complex computational studies. The authors acknowledge limitations in metadata, UI, and hardware considerations, which guide future work.

Abstract

In recent years, both the research community and the general public have raised serious questions about the reproducibility and replicability of scientific work. Since many studies include some kind of computational work, these issues are also a technological challenge, not only in computer science but in most research domains. Computational replicability and reproducibility are not easy to achieve because of the variety of computational environments that can be used. Indeed, it is challenging to recreate the same environment with the same frameworks, code, programming languages, dependencies, and so on. We propose a framework, called SciRep, that supports the configuration, execution, and packaging of computational experiments by defining their code, data, programming languages, dependencies, databases, and commands to be executed. After the initial configuration, the experiments can be executed any number of times, always producing exactly the same results. Our approach allows the creation of a reproducibility package for experiments from multiple scientific fields, from medicine to computer science, which can be re-executed on any computer. The produced package acts as a capsule, holding everything necessary to re-execute the experiment. To evaluate our framework, we compare it with three state-of-the-art tools and use it to reproduce 18 experiments extracted from published scientific articles. With our approach, we were able to execute 16 (89%) of those experiments, while the other tools reached only 61%, thus showing that our approach is effective. Moreover, all the experiments that were executed produced the results presented in the original publication. Thus, SciRep was able to reproduce 100% of the experiments it could run.
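To make the idea of defining an experiment by its languages, dependencies, and commands more concrete, the following is a minimal sketch in Python. It is not SciRep's actual configuration format or API: the `ExperimentSpec` class, its fields, and the `render_dockerfile` helper are hypothetical names introduced here only for illustration, and the sketch assumes a Python experiment whose dependencies are installed with `pip`.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ExperimentSpec:
    """Hypothetical experiment description (not SciRep's real schema)."""
    base_image: str                      # e.g. a pinned language runtime image
    dependencies: list[str] = field(default_factory=list)  # pinned packages
    commands: list[str] = field(default_factory=list)      # run in order
    code_dir: str = "."                  # directory holding code and data


def render_dockerfile(spec: ExperimentSpec) -> str:
    """Render a Dockerfile that captures the environment and the commands."""
    lines = [
        f"FROM {spec.base_image}",
        "WORKDIR /experiment",
        f"COPY {spec.code_dir} /experiment",
    ]
    if spec.dependencies:
        # Pinning exact versions is what keeps re-executions comparable.
        lines.append("RUN pip install --no-cache-dir " + " ".join(spec.dependencies))
    if spec.commands:
        # Chain the commands so the container stops at the first failure.
        chained = " && ".join(spec.commands)
        lines.append("CMD " + json.dumps(["sh", "-c", chained]))
    return "\n".join(lines) + "\n"


spec = ExperimentSpec(
    base_image="python:3.11-slim",
    dependencies=["numpy==1.26.4"],
    commands=["python analysis.py"],
)
dockerfile = render_dockerfile(spec)
print(dockerfile)
```

Building and running the image produced from such a file would re-execute the same commands in the same environment on any machine with Docker, which is the property the abstract describes as a capsule. Databases and automatic dependency inference, both of which the framework supports, are left out of this sketch.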
