BUMP: A Benchmark of Reproducible Breaking Dependency Updates

Frank Reyes, Yogya Gamage, Gabriel Skoglund, Benoit Baudry, Martin Monperrus

TL;DR

BUMP addresses the need for reproducible benchmarking of breaking dependency updates in Java/Maven projects. It presents an end-to-end methodology to construct a repository of 571 reproducible breaking updates drawn from 153 real-world projects, encapsulated as Docker images to guarantee long-term reproducibility across platforms. The work provides a detailed characterization of failure modes (compilation, test, enforcer, lock, and resolution) and analyzes the underlying library changes driving breakages, including a quantitative mapping via japicmp and a normalized breakage likelihood metric $\mathcal{S}$. By delivering a publicly available, fully reproducible benchmark, BUMP enables robust empirical research on dependency management, automatic repair, and compatibility analysis with practical impact for tooling and best practices.

Abstract

Third-party dependency updates can cause a build to fail if the new dependency version introduces a change that is incompatible with the usage: this is called a breaking dependency update. Research on breaking dependency updates is active, with works on characterization, understanding, automatic repair of breaking updates, and other software engineering aspects. All such research projects require a benchmark of breaking updates that has the following properties: 1) it contains real-world breaking updates; 2) the breaking updates can be executed; 3) the benchmark provides stable scientific artifacts of breaking updates over time, a property we call reproducibility. To the best of our knowledge, such a benchmark is missing. To address this problem, we present BUMP, a new benchmark that contains reproducible breaking dependency updates in the context of Java projects built with the Maven build system. BUMP contains 571 breaking dependency updates collected from 153 Java projects. BUMP ensures long-term reproducibility of dependency updates on different platforms, guaranteeing consistent build failures. We categorize the different causes of build breakage in BUMP, providing novel insights for future work on breaking update engineering. To our knowledge, BUMP is the first of its kind, providing hundreds of real-world breaking updates that have all been made reproducible.

Paper Structure (23 sections, 1 equation, 4 figures, 5 tables)

Figures (4)

  • Figure 1: A real-world example of a breaking update.
  • Figure 2: Overview of the methodology to build the BUMP benchmark.
  • Figure 3: Overview of the process to map API changes to the client project's build errors.
  • Figure 4: Distribution of dependencies per project in BUMP.

Theorems & Definitions (5)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5