Does Using Bazel Help Speed Up Continuous Integration Builds?

Shenyu Zheng, Bram Adams, Ahmed E. Hassan

TL;DR

The paper tackles the problem of long CI build times and investigates whether Bazel's artifact-based, parallel, and incremental build features deliver the promised performance benefits in real-world CI contexts. It employs a large-scale empirical methodology, collecting 383 Bazel projects and 4,727 Maven projects, analyzing CI configurations across four services, and conducting thousands of controlled experiments (3,500 parallelization runs and 102,232 cache experiments) to quantify impact. Key findings show substantial speedups for long-build duration projects with increased parallelism (up to 12.80x at 16 workers) and meaningful gains from incremental builds (median ~4x with caches), but also reveal underutilization in CI practices and variable cache effectiveness, especially for short to medium builds. The work provides practical guidance for developers on when and how to apply Bazel’s parallel and incremental features in CI, highlights considerations for cache strategies, and points to future research on distributed builds and CI testing to further improve build efficiency and reliability.

Abstract

A long continuous integration (CI) build forces developers to wait for CI feedback before starting subsequent development activities, which wastes time. In addition to a variety of build scheduling and test selection heuristics studied in the past, new artifact-based build technologies like Bazel have built-in support for advanced performance optimizations such as parallel builds and incremental builds (caching of build results). However, little is known about the extent to which new build technologies like Bazel deliver on their promised benefits, especially for long-build duration projects. In this study, we collected 383 Bazel projects from GitHub, studied their usage of Bazel's parallel and incremental builds in 4 popular CI services, and compared the results with Maven projects. We conducted 3,500 experiments on 383 Bazel projects and analyzed the build logs of a subset of 70 buildable projects to evaluate the performance impact of Bazel's parallel builds. Additionally, we performed 102,232 experiments on the 70 buildable projects' last 100 commits to evaluate Bazel's incremental build performance. Our results show that 31.23% of Bazel projects adopt a CI service but do not use Bazel in the CI service, while among those that do use Bazel in CI, 27.76% use other tools to facilitate Bazel's execution. Compared to sequential builds, the median speedups for long-build duration projects are 2.00x, 3.84x, 7.36x, and 12.80x at parallelism degrees 2, 4, 8, and 16, respectively. Compared to a clean build, applying incremental builds achieves a median speedup of 4.22x (with a build system tool-independent CI cache) and 4.71x (with a build system tool-specific cache) for long-build duration projects. Our results provide guidance for developers to improve the usage of Bazel in their projects.
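
One way to read the reported parallel speedups is to divide each by its parallelism degree, which gives the fraction of ideal linear scaling that was achieved. The sketch below does this arithmetic on the four median values quoted in the abstract. The "parallel efficiency" metric and the names `median_speedup` and `parallel_efficiency` are illustrative assumptions introduced here, not terminology or code from the paper.

```python
# Median speedups for long-build duration projects, copied from the abstract.
# Keys are parallelism degrees; values are speedups relative to a sequential build.
median_speedup = {2: 2.00, 4: 3.84, 8: 7.36, 16: 12.80}


def parallel_efficiency(speedup: float, degree: int) -> float:
    """Fraction of ideal linear speedup achieved (1.0 would be perfectly linear)."""
    return speedup / degree


# Efficiency at each parallelism degree (an assumed metric, not reported in the paper).
efficiency = {n: parallel_efficiency(s, n) for n, s in median_speedup.items()}

for n, e in efficiency.items():
    print(f"degree={n:>2}  speedup={median_speedup[n]:.2f}x  efficiency={e:.2f}")
```

Under this reading, the median speedup stays close to linear up to parallelism degree 8 and falls to roughly 80% of linear at degree 16.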

Paper Structure

This paper contains 30 sections, 17 figures, and 11 tables.

Figures (17)

  • Figure 1: The phases in Maven default lifecycle (The phases with darker backgrounds are the most commonly used ones).
  • Figure 2: An example Maven pom file
  • Figure 3: An example of the Bazel build configuration file (https://github.com/bazelbuild/examples)
  • Figure 4: (a) The DAG of the build process with parallel build. (b) The DAG of the build process with incremental build. Each node in the DAG represents a compilation unit in the build.
  • Figure 5: The process of data collection
  • ...and 12 more figures