Practitioners' Challenges and Perceptions of CI Build Failure Predictions at Atlassian
Yang Hong, Chakkrit Tantithamthavorn, Jirat Pasuksmit, Patanamon Thongtanunam, Arik Friedman, Xing Zhao, Anton Krasikov
TL;DR
This study investigates CI build failures at Atlassian and evaluates CI build prediction within Bitbucket, combining large-scale analysis of internal data with a practitioner survey. An analysis of $350{,}037$ pull requests (PRs) across $1{,}630$ projects yields an AUC of $0.82$ for a logistic regression model, with repository-history signals identified as the strongest predictors. Qualitative insights from 53 practitioners reveal that predictions offer proactive value but raise concerns about accuracy and over-reliance, underscoring the need for context-aware explanations. The work provides industry-grounded guidance for integrating CI build failure predictions into CI workflows, emphasizing human-in-the-loop design and explainability to improve adoption and impact.
Abstract
Continuous Integration (CI) build failures can significantly impact the software development process and teams, for example by delaying the release of new features and reducing developers' productivity. In this work, we report on an empirical study that investigates CI build failures throughout product development at Atlassian. Our quantitative analysis found that the repository dimension is the key factor influencing CI build failures. In addition, our qualitative survey revealed that Atlassian developers perceive CI build failures as challenging issues in practice. Furthermore, we found that CI build prediction can not only provide proactive insight into CI build failures but also facilitate the team's decision-making. Our study sheds light on the challenges and expectations involved in integrating CI build prediction tools into the Bitbucket environment, providing valuable insights for enhancing CI processes.
