The Theory and Practice of Computing the Bus-Factor

Sebastiano A. Piccolo, Pasquale De Meo, Giorgio Terracina, Gianluigi Greco

TL;DR

This paper develops a unified, domain-agnostic framework for bus-factor estimation: it models projects as bipartite graphs of people and tasks and casts the computation of the bus-factor as a family of combinatorial optimization problems. It also introduces a novel bus-factor measure inspired by network robustness.

Abstract

The bus-factor is a measure of project risk with respect to personnel availability, informally defined as the number of people whose sudden unavailability would cause a project to stall or experience severe delays. Despite its intuitive appeal, existing bus-factor measures rely on heterogeneous modeling assumptions, ambiguous definitions of failure, and domain-specific artifacts, limiting their generality, comparability, and ability to capture project fragmentation. In this paper, we develop a unified, domain-agnostic framework for bus-factor estimation by modeling projects as bipartite graphs of people and tasks and casting the computation of the bus-factor as a family of combinatorial optimization problems. Within this framework, we formalize and reconcile two complementary interpretations of the bus-factor, redundancy and criticality, corresponding to the Maximum Redundant Set and the Minimum Critical Set, respectively, and prove that both formulations are NP-hard. Building on this theoretical foundation, we introduce a novel bus-factor measure inspired by network robustness. Unlike prior approaches, the proposed measure captures both loss of coverage and increasing project fragmentation by tracking the largest connected set of tasks under progressive contributor removal. The resulting measure is normalized, threshold-free, and applicable across domains; we show that its exact computation is NP-hard as well. We further propose efficient linear-time approximation algorithms for all considered measures. Finally, we evaluate their behavior through a sensitivity analysis based on controlled perturbations of project structures, guided by expectations derived from project management theory. Our results show that the robustness-based measure behaves consistently with these expectations and provides a more informative and stable assessment of project risk than existing alternatives.
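
To make the robustness idea in the abstract concrete, the following is a minimal sketch, not the paper's definition. It assumes a project is given as a mapping from each person to the set of tasks they work on, and it averages the fraction of tasks that remain in the largest connected group as people are removed one by one. The function names, the toy data, the degree-based removal order, and the averaging-based normalization are all illustrative assumptions; the paper's exact formula for the measure is not reproduced here.

```python
from collections import defaultdict


def largest_task_component(assign):
    """Largest number of tasks linked together through shared people.

    assign: dict mapping person -> set of tasks (hypothetical input format).
    Tasks with no remaining person are not counted at all, so the value
    drops both when coverage is lost and when the project fragments.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for tasks in assign.values():
        tasks = list(tasks)
        for t in tasks:
            parent.setdefault(t, t)
        # All tasks of one person belong to the same component.
        for t in tasks[1:]:
            root_a, root_b = find(tasks[0]), find(t)
            if root_a != root_b:
                parent[root_a] = root_b

    sizes = defaultdict(int)
    for t in parent:
        sizes[find(t)] += 1
    return max(sizes.values(), default=0)


def robustness_sketch(assign, order):
    """Average fraction of tasks in the largest component along a removal order.

    The averaging is an assumption borrowed from classic network-robustness
    measures; it is NOT taken from the paper.
    """
    all_tasks = set().union(*assign.values())
    if not all_tasks or not order:
        return 0.0
    remaining = dict(assign)
    total = 0.0
    for person in order:
        remaining.pop(person)
        total += largest_task_component(remaining) / len(all_tasks)
    return total / len(order)


# Hypothetical toy project: p1 is the only person linking the two task pairs.
toy = {"p1": {"t1", "t2", "t3", "t4"}, "p2": {"t1", "t2"}, "p3": {"t3", "t4"}}
# Assumed removal order: most-connected people leave first.
order = sorted(toy, key=lambda p: len(toy[p]), reverse=True)
print(round(robustness_sketch(toy, order), 3))  # prints 0.333
```

In this toy project every task stays covered after p1 leaves, yet the largest connected group of tasks halves. A purely coverage-based threshold would not register that change, which is the kind of fragmentation the abstract says the robustness-based measure is meant to capture.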

Paper Structure (26 sections, 8 theorems, 12 equations, 6 figures, 3 algorithms)

Key Result

Proposition 1

$Z_{\min,t}(G) = MCS_{1-t}(G) - 1$, for all $0 < t < 1$.

Figures (6)

  • Figure 1: Visual comparison of the measures to estimate a project bus-factor. Consider the toy project in A). When $p_1$ leaves, the project fragments into four disconnected components. Under further removals, tasks quickly become critically dependent on single contributors. B) Maximum Redundant Set (MRS) estimates a bus-factor of 8 by removing redundant contributors. C) Minimum Critical Set (MCS) estimates a bus-factor of 7 by applying a coverage threshold, ignoring project topology. D) Our robustness-based measure captures fragmentation by evaluating the largest number of tasks that remain connected in a single component as people leave, yielding a normalized, threshold-free bus-factor consistent with project intuition.
  • Figure 2: Visualizations of the NP-hardness reductions. A) Reduction from Clique to Minimum Critical Set (solutions highlighted): nodes in the original graph become the left vertex set ($P$) in the bipartite construction, and edges become the right vertex set ($T$). Edges in the bipartite graph represent the incidences of the original graph. A clique in the original graph corresponds to a minimum critical set in the bipartite graph. B) Reduction from Set Cover to Partial Set Cover for the Maximum Redundant Set proof: an instance of Set Cover, represented here as a bipartite graph with its solution highlighted, is transformed into an instance of Partial Set Cover by adding an appropriate number of bipartite vertex pairs.
  • Figure 3: Algorithmic performance analysis. A) Running time comparison. B) Sensitivity of MRS to the threshold $t$. C) Sensitivity of MCS to the threshold $t$. D) Average relative size of the MRS for different processing strategies. E) Average relative size of the MCS for different removal orders $(\pi)$. F) Average robustness $\mathcal{B}(G,\pi)$ for different removal orders $(\pi)$. MRS: Maximum Redundant Set; MCS: Minimum Critical Set; Robustness: Bus-factor as Bipartite Network Robustness.
  • Figure 4: Sensitivity of bus-factor measures to changes in network density (Q1). A) Densification: MCS and Robustness increase with network density, as expected. Inset: values of MCS and Robustness expressed as number of people. B) Sparsification: MCS and Robustness decrease with network density, as expected. Inset: a zoomed-in view of the range $[5\,000, 10\,000]$ edges removed, showing that Robustness is more stable than MCS. MRS: Maximum Redundant Set; MCS: Minimum Critical Set; Robustness: Bus-Factor as Bipartite Network Robustness.
  • Figure 5: Sensitivity of bus-factor measures to strategies aimed at increasing personnel redundancy (Q2). A) Sensitivity to singletons (i.e., people who work on only one task). MCS and MRS increase as singletons are added to the network and do not exhibit an upper bound. This behavior is undesirable for a bus-factor measure. Robustness, instead, correctly captures the diminishing returns of adding singletons and decreases as singletons are added to the network. In the inset: values of MCS and Robustness expressed as numbers of people. While normalized Robustness decreases, reflecting diminishing returns, the number of people required to fragment the network remains constant, as expected. B) Sensitivity to duplicates. MRS grows indefinitely without an upper bound. MCS grows as duplicates are added until it saturates due to its threshold. Robustness, in line with expectations, captures the diminishing returns of adding people with progressively lower degree, exhibiting behavior consistent with empirical findings on the importance of integrators. MRS: Maximum Redundant Set; MCS: Minimum Critical Set; Robustness: Bus-Factor as Bipartite Network Robustness.
  • ...and 1 more figure
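
The bipartite construction described in the Figure 2A caption (original nodes become people $P$, original edges become tasks $T$, and bipartite edges encode incidence) can be sketched as follows. This is only an illustration of the construction. The function names and the example graph are hypothetical, and the thresholds and the full argument used in the paper's NP-hardness proof are not reproduced here.

```python
def incidence_bipartite(nodes, edges):
    """Build the people-to-tasks mapping of the incidence construction.

    Each original node becomes a person; each original edge becomes a task
    worked on by exactly its two endpoints.
    """
    assign = {v: set() for v in nodes}
    for u, v in edges:
        task = frozenset((u, v))  # one task per undirected edge
        assign[u].add(task)
        assign[v].add(task)
    return assign


def uncovered_tasks(assign, removed):
    """Tasks left with no contributor once the people in `removed` leave."""
    covered = set()
    for person, tasks in assign.items():
        if person not in removed:
            covered |= tasks
    all_tasks = set().union(*assign.values())
    return all_tasks - covered


# Hypothetical example: a triangle a-b-c with a pendant node d attached to c.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
assign = incidence_bipartite(nodes, edges)

# Removing the 3-clique {a, b, c} strands all 3 of its edge-tasks.
print(len(uncovered_tasks(assign, {"a", "b", "c"})))  # prints 3
# Removing a non-clique set of the same size strands fewer tasks.
print(len(uncovered_tasks(assign, {"a", "b", "d"})))  # prints 1
```

An edge-task loses all its contributors only when both endpoints are removed, so the number of uncovered tasks equals the number of original edges inside the removed set. A set of $k$ people therefore strands $k(k-1)/2$ tasks exactly when it is a clique in the original graph, which is the correspondence the caption points to.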

Theorems & Definitions (17)

  • Proposition 1
  • proof
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Corollary 1
  • proof
  • Definition 1: Backbone set
  • Theorem 3
  • ...and 7 more