Asymptotic analysis and efficient random sampling of directed ordered acyclic graphs

Martin Pépin, Alfredo Viola

TL;DR

This paper provides efficient algorithms for sampling directed ordered acyclic graphs (DOAGs), a new class of DAGs in which the out-edges of each vertex are ordered, both with and without control on the number of edges, and obtains an asymptotic equivalent of their number.

Abstract

Directed acyclic graphs (DAGs) are directed graphs in which there is no path from a vertex to itself. DAGs are an omnipresent data structure in computer science, and the problems of counting the DAGs with a given number of vertices and of sampling them uniformly at random were solved in the 1970s and the 2000s, respectively. In this paper, we explore a new variation of this model in which DAGs are endowed with an independent ordering of the out-edges of each vertex, which makes it possible to model a wide range of existing data structures. We provide efficient algorithms for sampling objects of this new class, both with and without control on the number of edges, and obtain an asymptotic equivalent of their number. We also show the applicability of our method by providing an effective algorithm for the random generation of classical labelled DAGs with a prescribed number of vertices and edges, based on a similar approach. This is the first known algorithm for sampling labelled DAGs with full control on the number of edges, and it meets a need in applications that had already been acknowledged in the literature.
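
To make the two notions in the abstract concrete, here is a minimal Python sketch. It is not taken from the paper. It assumes a graph is stored as a dictionary mapping each vertex to the list of its successors, so that the list order stands in for the ordering of the out-edges. The function name `is_acyclic` and the example graphs are hypothetical, and the acyclicity check is the standard Kahn algorithm (repeatedly removing vertices with no incoming edge), not an algorithm from the paper. The paper's formal definitions (Definitions 1 and 2) are not reproduced on this page and may carry more structure than this sketch, for instance an ordering of the sources.

```python
from collections import deque


def is_acyclic(out_edges):
    """Return True if the directed graph has no cycle (Kahn's algorithm).

    `out_edges` maps every vertex to the ordered list of its successors.
    Every vertex must appear as a key, even when it has no out-edge.
    """
    # Count incoming edges for every vertex.
    indegree = {v: 0 for v in out_edges}
    for successors in out_edges.values():
        for w in successors:
            indegree[w] += 1

    # Start from the sources, i.e. vertices without incoming edges.
    queue = deque(v for v, d in indegree.items() if d == 0)
    removed = 0
    while queue:
        v = queue.popleft()
        removed += 1
        # Removing v deletes its out-edges, which may create new sources.
        for w in out_edges[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)

    # Every vertex gets removed exactly when there is no cycle.
    return removed == len(out_edges)


# Same edge set, different order of the out-edges of "a".
graph_a = {"a": ["b", "c"], "b": ["c"], "c": []}
graph_b = {"a": ["c", "b"], "b": ["c"], "c": []}

# A two-vertex cycle, which is not a DAG.
cyclic = {"a": ["b"], "b": ["a"]}
```

In this representation `graph_a` and `graph_b` are both acyclic and have the same underlying edge set, yet they compare as different objects because the successor lists of `"a"` are ordered differently. This is the kind of distinction that an ordering of the out-edges introduces.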

Paper Structure

This paper contains 35 sections, 15 theorems, 54 equations, 16 figures, 2 tables, 6 algorithms.

Key Result

Theorem 1

Let $N, M > 0$ be two integers, and let $\mathcal{P}$ be a subset of $\mathbb{N}$ such that $\mathcal{P} \cap \llbracket 0; n\rrbracket$ can be enumerated in linear time in $n$. Computing $D_{n, m, k}^{\mathcal{P}}$ for all $n \leq N$, all $m \leq M$, and all possible $k$ can be done with $O(N^4 M)$

Figures (16)

  • Figure 1: All DOAGs with respectively $1$ vertex, $2$ vertices, $3$ vertices, and simultaneously $4$ vertices and $3$ edges. All edges are implicitly oriented from top to bottom; the blue labels and arrows represent the orderings of the sources and of the out-edges (always from left to right).
  • Figure 2: The first two steps of the recursive decomposition of a DOAG by removing sources one by one in a breadth-first search (BFS) fashion. The edges are implicitly oriented from top to bottom, and the order of the outgoing edges of each vertex is indicated by the thinner blue arrows (always from left to right here). The integer labels at each stage indicate the ordering of the sources. The big disk, square, and triangle are only here to highlight particular vertices involved with the functions $f$ in the decomposition.
  • Figure 3: The graphical representation of a (truncated) Git history and four bounded-degree random DOAGs.
  • Figure 4: An example DOAG and its labelled transition matrix; zeros are represented by the absence of a number.
  • Figure 5: An example of a matrix of variations that cannot be obtained as a labelled transition matrix of a DOAG. The labelled DOAG that it encodes is not labelled according to the decomposition order.
  • ...and 11 more figures

Theorems & Definitions (30)

  • Definition 1: Directed Ordered Graph
  • Definition 2: Directed Ordered Acyclic Graph
  • Theorem 1
  • Theorem 2
  • Definition 3: Labelled transition matrix of a DOAG
  • Definition 4: Variation
  • Definition 5: Variation matrix
  • Theorem 3
  • proof
  • Theorem 4
  • ...and 20 more