Random Generation of Git Graphs

Julien Courtiel, Martin Pépin

TL;DR

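This work addresses the uniform random generation of Git graphs that follow the feature branch workflow, with n vertices including k on the main branch. It provides three algorithms for three use-cases: a rejection-based one for small k, one that accepts any k at the price of costly precalculation, and a Boltzmann generator that produces very large graphs while targeting a constant k/n ratio.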

Abstract

Version Control Systems, such as Git and Mercurial, manage the history of a project as a Directed Acyclic Graph encoding the various divergences and synchronizations happening in its life cycle. A popular workflow in the industry, called the feature branch workflow, constrains these graphs to be of a particular shape: a unique main branch, and non-interfering feature branches. Here we focus on the uniform random generation of those graphs with n vertices, including k on the main branch, for which we provide three algorithms, for three different use-cases. The first, based on rejection, is efficient when aiming for small values of k (more precisely whenever $k = O(\sqrt{n})$). The second takes as input any number k of commits in the main branch, but requires costly precalculation. The last one is a Boltzmann generator and enables us to generate very large graphs while targeting a constant k/n ratio. All these algorithms are linear in the size of their outputs.

Paper Structure

This paper contains 10 sections, 5 theorems, 8 equations, 4 figures, 3 algorithms.

Key Result

Proposition 1

Let $u$ be any positive real number. Consider $\gamma_n$, a random Git graph of size $n$ taken with probability $\dfrac{u^{k(\gamma_n)}}{\sum_{\gamma \text{ Git graph of size } n} u^{k(\gamma)}}$. Then the random variable $\frac{k(\gamma_n)}{n}$ converges in probability to $\frac{1}{2}$ when $n$ goes to infinity.
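
The weighting in Proposition 1 gives each graph $\gamma$ a probability proportional to $u^{k(\gamma)}$, normalized by the sum of these weights over all graphs of the same size. The following minimal sketch computes such a distribution for a toy list of $k$-values. The function name `weighted_distribution` and the list `toy_k_values` are hypothetical and chosen only for illustration; the list is not an enumeration of actual Git graphs, and the sketch is not one of the paper's algorithms.

```python
from fractions import Fraction


def weighted_distribution(k_values, u):
    """Return the probability of each object when object i has weight u ** k_values[i].

    Each probability is u**k divided by the sum of u**k over all objects,
    mirroring the normalization used in Proposition 1.
    """
    u = Fraction(u)  # exact arithmetic, so the probabilities sum to exactly 1
    weights = [u ** k for k in k_values]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical toy data: the k-value of four objects of the same size.
toy_k_values = [1, 2, 2, 3]

# With u = 2, objects with a larger k are favoured.
probs = weighted_distribution(toy_k_values, 2)

# Expected value of k under this weighted distribution.
expected_k = sum(p * k for p, k in zip(probs, toy_k_values))
```

Setting `u = 1` gives every object the same weight and recovers the uniform distribution; values of `u` above or below 1 bias the draw towards larger or smaller $k$ respectively.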

Figures (4)

  • Figure 1: All Git graphs with $5$ vertices including $3$ black vertices. Edges are oriented from left to right. Free vertices are outlined in orange.
  • Figure 2: How to decompose a Git graph.
  • Figure 3: Outline of the bijection between Git graphs and cyclariums.
  • Figure 4: Illustration of the first steps of Algorithm \ref{algo:boltz2}.

Theorems & Definitions (7)

  • Definition 1: Git graph
  • Proposition 1
  • Definition 2
  • Theorem 1
  • Proposition 2
  • Proposition 3
  • Corollary 1