Detecting Continuous Integration Skip: A Reinforcement Learning-based Approach

Hajer Mhalla, Mohamed Aymen Saied

TL;DR

This work tackles the problem of identifying CI commits that should be skipped to save computational resources while preserving CI benefits. It proposes a Deep Reinforcement Learning–driven method to construct an optimal, interpretable Decision Tree for CI-skip detection, addressing data imbalance without sacrificing explainability. Through within-project and cross-project evaluations on Travis CI data and an extension to GitHub Actions workflow features, the approach outperforms state-of-the-art baselines and demonstrates robust generalization. The findings highlight the practical value of RL-guided tree learning and point to future directions combining ensemble methods and broader CI platforms.

Abstract

The software industry is experiencing a surge in the adoption of Continuous Integration (CI) practices, both in commercial and open-source environments. CI practices facilitate the seamless integration of code changes by employing automated building and testing processes. Frameworks such as Travis CI and GitHub Actions have significantly contributed to simplifying and enhancing the CI process, rendering it more accessible and efficient for development teams. Despite the availability of these CI tools, developers continue to encounter difficulties in accurately flagging commits as either suitable for CI execution or as candidates for skipping, especially in large projects with many dependencies. Inaccurate flagging of commits can lead to resource-intensive test and build processes, as even minor commits may inadvertently trigger the Continuous Integration process. The problem of detecting CI-skip commits can be modeled as a binary classification task in which we decide either to build a commit or to skip it. This study proposes a novel solution that leverages Deep Reinforcement Learning techniques to construct an optimal Decision Tree classifier that addresses the imbalanced nature of the data. We evaluate our solution by running within-project and cross-project validation benchmarks on a diverse range of open-source projects hosted on GitHub; the results are superior to those of existing state-of-the-art methods.
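
To make the binary-classification framing concrete, the following is a minimal sketch of a hand-written decision tree that labels a commit as "skip" or "build". It is not the method proposed in the paper: the tree here is fixed by hand rather than constructed by a Deep Reinforcement Learning agent, and the feature names (`src_files_changed`, `build_files_changed`), thresholds, and sample commits are hypothetical placeholders chosen only for illustration. Because skip commits are typically the minority class, the sketch scores the tree with the F1 score on the skip class, which is one common choice for imbalanced data and an assumption on our part.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    skip: bool  # True = skip CI, False = run the build


@dataclass
class Node:
    feature: str      # hypothetical commit feature tested at this node
    threshold: float  # go left if value <= threshold, otherwise right
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def predict(tree: Tree, commit: dict) -> bool:
    """Walk from the root to a leaf and return the skip/build decision."""
    while isinstance(tree, Node):
        if commit[tree.feature] <= tree.threshold:
            tree = tree.left
        else:
            tree = tree.right
    return tree.skip


def f1_skip(tree: Tree, commits: list, labels: list) -> float:
    """F1 score on the 'skip' class, which tolerates class imbalance
    better than plain accuracy."""
    tp = fp = fn = 0
    for commit, label in zip(commits, labels):
        pred = predict(tree, commit)
        if pred and label:
            tp += 1
        elif pred and not label:
            fp += 1
        elif not pred and label:
            fn += 1
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical tree: skip only when neither source nor build files changed.
TREE = Node(
    feature="src_files_changed",
    threshold=0,
    left=Node("build_files_changed", 0, Leaf(skip=True), Leaf(skip=False)),
    right=Leaf(skip=False),
)

# Made-up commits and labels (True = the commit should have been skipped).
COMMITS = [
    {"src_files_changed": 0, "build_files_changed": 0},
    {"src_files_changed": 3, "build_files_changed": 0},
    {"src_files_changed": 0, "build_files_changed": 1},
    {"src_files_changed": 0, "build_files_changed": 0},
    {"src_files_changed": 2, "build_files_changed": 0},
]
LABELS = [True, False, False, False, True]

if __name__ == "__main__":
    print([predict(TREE, c) for c in COMMITS])
    print(f1_skip(TREE, COMMITS, LABELS))
```

A tree of this shape stays readable as a short chain of threshold tests, which is the interpretability property the abstract emphasizes; how the splits are actually chosen is the part the paper delegates to the reinforcement learning agent.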

Paper Structure

This paper contains 22 sections, 14 equations, 4 figures, 8 tables.

Figures (4)

  • Figure 1: The workflow of the proposed approach applied to a commit sampled from our dataset
  • Figure 2: An example of a CI-skip decision tree
  • Figure 3: Agent-Environment Interaction in Deep Q-Learning
  • Figure 4: The Framework of the proposed solution