What is Reproducibility in Artificial Intelligence and Machine Learning Research?

Abhyuday Desai; Mohamed Abdelhamid; Nakul R. Padalkar

TL;DR

This paper addresses the reproducibility crisis in AI/ML by proposing a structured framework that clarifies the definitions of repeatability, reproducibility, and replicability and maps them onto a spectrum of validation rigor. It surveys existing terminology and the reproducibility landscape, then introduces a hierarchy that distinguishes dependent from independent reproducibility and direct from conceptual replicability, each tied to components of the research workflow. The authors illustrate the framework through case studies, including incidents of data leakage, cross-institution generalization, SMOTE analyses, and foundation-model validation challenges. The framework aims to improve reliability and trust by guiding researchers in designing robust validation studies and by helping readers and practitioners assess claims in AI/ML research.

Abstract

In the rapidly evolving fields of Artificial Intelligence (AI) and Machine Learning (ML), the reproducibility crisis underscores the urgent need for clear validation methodologies to maintain scientific integrity and encourage advancement. The crisis is compounded by the prevalent confusion over validation terminology. In response to this challenge, we introduce a framework that clarifies the roles and definitions of key validation efforts: repeatability, dependent and independent reproducibility, and direct and conceptual replicability. This structured framework aims to provide AI/ML researchers with the necessary clarity on these essential concepts, facilitating the appropriate design, conduct, and interpretation of validation studies. By articulating the nuances and specific roles of each type of validation study, we aim to enhance the reliability and trustworthiness of research findings and support the community's efforts to address reproducibility challenges effectively.

Paper Structure (27 sections, 3 figures, 1 table)

This paper contains 27 sections, 3 figures, 1 table.

Introduction
Terminology Confusion
Motivation for Our Framework
AI/ML Research Context and Reproducibility Landscape
Scope and Definitions: AI and ML in Context
Machine Learning
Deep Learning
Foundation Models and Generative AI
Reproducibility Landscape: From Concepts to Implementation
Conceptual Frameworks and Definitions
Standards and Certification Systems
Implementation Tools and Platforms
The Reproducibility Framework
Components of Research Studies
The Validation Spectrum
...and 12 more sections

Figures (3)

Figure 1: Total number of papers submitted and accepted at NeurIPS, ICML, ICLR and AAAI from 2018 to 2023
Figure 2: The research publication workflow
Figure 3: Hierarchy of validation studies in research
