Enumerative and Distributional Results for $d$-combining Tree-Child Networks

Yu-Sheng Chang, Michael Fuchs, Hexuan Liu, Michael Wallner, Guan-Ru Yu

TL;DR

The paper determines the distributional behavior of shape parameters of a network drawn uniformly at random from the set of all tree-child networks with the same number of leaves; the resulting distributions are normal, Bessel, Poisson, and degenerate.

Abstract

Tree-child networks are one of the most prominent network classes for modeling evolutionary processes that contain reticulation events. Several recent studies have addressed counting questions for bicombining tree-child networks, in which every reticulation node has exactly two parents. We extend these studies to $d$-combining tree-child networks, in which every reticulation node has $d \geq 2$ parents, and we study one-component as well as general tree-child networks. For the number of one-component networks, we derive an exact formula from which asymptotic results follow that contain a stretched exponential for $d=2$, yet not for $d \geq 3$. For general networks, we find a novel encoding by words which leads to a recurrence for their numbers. From this recurrence, we derive asymptotic results which show the appearance of a stretched exponential for all $d \geq 2$. Moreover, we give results on the distribution of shape parameters (e.g., number of reticulation nodes, Sackin index) of a network drawn uniformly at random from the set of all tree-child networks with the same number of leaves. We show phase transitions depending on $d$, leading to normal, Bessel, Poisson, and degenerate distributions. Some of our results are new even in the bicombining case.

Paper Structure

This paper contains 20 sections, 36 theorems, 174 equations, 12 figures, and 6 tables.

Key Result

Theorem 1.5

The following asymptotic equivalences hold for one-component $d$-combining tree-child networks.

Figures (12)

  • Figure 1: (a) A $3$-combining phylogenetic network which is not a tree-child network (because both children of the tree node $x$ are reticulation nodes and the only child of the reticulation node $y$ is also a reticulation node); (b) a $3$-combining tree-child network; (c) a $3$-combining one-component tree-child network.
  • Figure 2: Construction of $\mathcal{OC}_{3,1}^{(2)}$ in Remark \ref{rem:PathLengthToptree} ($K=2$, $M=3$). (Top, left) The only phylogenetic tree with $2$ leaves; path lengths below each node and total path length $5$. (Bottom) $\binom{4}{2}=6$ top trees $\mathcal{OC}_{2,1}^{(2)}$ created from it after adding $2$ unary nodes and the respective "balls and bars" diagrams. (Top, right) Superposition of all $6$ top trees.
  • Figure 3: Enumeration of $P(\mathcal{OC}_{3,1}^{(2)})$ in Remark \ref{rem:PathLengthToptree} using superposition ($K=2$, $M=3$); see Figure \ref{fig:Sackin-OTC}. (Left) Step 1: Each new instance increases edge weight by one; total $\binom{4}{2}=6$. (Middle) Step 2: Correct path length of original nodes; sum unary nodes per edge. (Right) Step 3: Add path length for unary nodes; for $i$ nodes weight $1+2+\dots+i$. (Bottom) "Balls and bars" corresponding to each step; the two added bars are shown bold.
  • Figure 4: A $3$-combining tree-child network with $5$ leaves and $2$ reticulation nodes (gray nodes). Thus, the number of free tree nodes is $2$ (red nodes) and there are $4$ free edges (red edges).
  • Figure 5: (a) The network from Figure \ref{3-comb-ex} together with the $4$ possible ways of choosing an outgoing free edge for every free tree node; (b) Replacing each free tree node by a reticulation node, which results in a maximally reticulated tree-child network whose path-components are indexed; (c) Labeling all internal nodes by labeling reticulation nodes and their parents with the label of their path-component. Note that all nodes on the chosen free edges only receive one label; (d) The word from $\mathcal{C}_{4,2}^{(3)}$ and the permutation corresponding to each network.
  • ...and 7 more figures
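
The count $\binom{4}{2}=6$ in the captions of Figures 2 and 3 can be checked by brute force. The following Python sketch is an illustration only and is not taken from the paper. It assumes that $M=3$ is the number of edges of the starting tree and that $K=2$ is the number of indistinguishable unary nodes placed on those edges, so that the count is the "balls and bars" number $\binom{K+M-1}{K}$. The function name `unary_node_placements` and this reading of $K$ and $M$ are hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb


def unary_node_placements(num_edges: int, num_unary: int) -> list[tuple[int, ...]]:
    """List every way to put num_unary indistinguishable unary nodes on num_edges edges.

    Each placement is returned as an occupancy tuple: entry i is the number
    of unary nodes placed on edge i. (Assumed model, not the paper's notation.)
    """
    placements = []
    # A multiset of edge indices of size num_unary is exactly one placement.
    for chosen in combinations_with_replacement(range(num_edges), num_unary):
        counts = [0] * num_edges
        for edge in chosen:
            counts[edge] += 1
        placements.append(tuple(counts))
    return placements


if __name__ == "__main__":
    M, K = 3, 2  # assumed meaning: M edges, K added unary nodes
    placements = unary_node_placements(M, K)
    for occupancy in placements:
        print(occupancy)
    # The brute-force count agrees with the binomial coefficient from the caption.
    assert len(placements) == comb(K + M - 1, K) == 6
```

Under these assumptions the enumeration yields six occupancy tuples, three with both unary nodes on the same edge and three with the nodes on two different edges.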

Theorems & Definitions (84)

  • Definition 1.1: Phylogenetic network
  • Definition 1.2: $d$-combining network
  • Definition 1.3: Tree-child network
  • Definition 1.4: One-component tree-child network
  • Theorem 1.5
  • Theorem 1.6
  • Remark 1.7
  • Theorem 1.8
  • Theorem 1.9
  • Corollary 1.10
  • ...and 74 more