Bus Factor Explorer

Egor Klimov, Muhammad Umair Ahmed, Nikolai Sviridov, Pouria Derakhshanfar, Eray Tüzün, Vladimir Kovalenko

TL;DR

The paper addresses the risk of uneven knowledge distribution in collaborative software projects through the bus factor (BF) metric. It introduces Bus Factor Explorer, a Dockerized web application that computes BF for GitHub repositories using an algorithm based on degree of authorship (DOA), visualizes the results as a treemap, offers a simulation mode for assessing knowledge-loss scenarios, and exports data for further analysis. The authors evaluate scalability on 935 popular GitHub repositories and report that analysis time scales linearly with the number of commits, with large commit histories processed in tens of seconds. The tool provides an API and an interactive UI for practitioners and researchers, supporting repository search, data exploration, and chart-based visualization. Planned work includes UX studies and support for more VCS hosts. Overall, Bus Factor Explorer offers a practical, extensible way to identify files and subsystems put at risk by developer turnover, and it supports future research on BF calculation and visualization.

Abstract

Bus factor (BF) is a metric that tracks knowledge distribution in a project. It is the minimal number of engineers that have to leave for a project to stall. Although there are several algorithms for calculating the bus factor, only a few tools allow easy calculation of the bus factor and convenient analysis of the results for projects hosted on Git-based providers. We introduce Bus Factor Explorer, a web application that provides an interface and an API to compute, export, and explore the bus factor metric via treemap visualization, simulation mode, and chart editor. It supports repositories hosted on GitHub, lets users search for repositories in the interface, and can process many repositories at the same time. By analyzing the VCS history, our tool allows users to identify the files and subsystems at risk of stalling in the event of developer turnover. The application and its source code are publicly available on GitHub at https://github.com/JetBrains-Research/bus-factor-explorer. The demonstration video can be found on YouTube: https://youtu.be/uIoV79N14z8
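
To make the definition concrete, here is a minimal sketch of one common greedy heuristic for estimating a bus factor. It is an illustration only and not necessarily the algorithm the tool implements. It assumes file authorship has already been decided (for example, from DOA scores) and that a project "stalls" once more than half of its files have no remaining author. The function name `bus_factor`, the `threshold` parameter, and the sample data are hypothetical.

```python
from collections import Counter


def bus_factor(file_authors: dict[str, set[str]], threshold: float = 0.5) -> int:
    """Greedy estimate of the bus factor.

    Repeatedly removes the author who covers the most files until more
    than `threshold` of all files have no remaining author. Returns the
    number of authors removed. The stalling rule is an assumption.
    """
    # Work on copies so the caller's data is not mutated.
    remaining = {path: set(authors) for path, authors in file_authors.items()}
    total = len(remaining)
    if total == 0:
        return 0

    removed = 0
    while True:
        orphaned = sum(1 for authors in remaining.values() if not authors)
        if orphaned / total > threshold:
            return removed

        # Count how many files each remaining author covers.
        coverage = Counter(a for authors in remaining.values() for a in authors)
        if not coverage:
            return removed

        # Pick the author with the widest coverage; break ties by name
        # so the result is deterministic.
        top = min(coverage, key=lambda a: (-coverage[a], a))
        for authors in remaining.values():
            authors.discard(top)
        removed += 1


# Hypothetical authorship data for a four-file project.
files = {
    "a.py": {"alice"},
    "b.py": {"alice", "bob"},
    "c.py": {"bob"},
    "d.py": {"carol"},
}
print(bus_factor(files))
```

In this toy project, removing alice orphans one file and removing bob orphans two more, so three of four files are left without an author after two departures. A greedy pass like this gives an estimate rather than a guaranteed minimum.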

Paper Structure (16 sections, 5 figures, 1 table)

Figures (5)

  • Figure 1: Treemap report for the cpython repository
  • Figure 2: Simulation mode for the Linux Kernel repository
  • Figure 3: An overview of the tool workflow. The tool iterates over the commits made in the 1.5 years preceding the most recent commit and collects information about file ownership. Next, it builds a file tree for the repository. Then, the tree is enriched with the bus factor data.
  • Figure 4: Treemap report for the cpython repository built with the interactive chart editor
  • Figure 5: Discovered dependency of repository analysis time on the number of commits