Identifiability of Level-1 Species Networks from Gene Tree Quartets

Elizabeth S. Allman, Hector Baños, Marina Garrote-Lopez, John A. Rhodes

TL;DR

This work investigates which level-1 network features are identifiable from concordance factors under the network multispecies coalescent model. It obtains results on both topological features of the network and numerical parameters, and uncovers a number of failures of identifiability related to 3-cycles in the network.

Abstract

When hybridization or other forms of lateral gene transfer have occurred, evolutionary relationships of species are better represented by phylogenetic networks than by trees. While inference of such networks remains challenging, several recently proposed methods are based on quartet concordance factors -- the probabilities that a tree relating a gene sampled from the species displays the possible 4-taxon relationships. Building on earlier results, we investigate what level-1 network features are identifiable from concordance factors under the network multispecies coalescent model. We obtain results on both topological features of the network, and numerical parameters, uncovering a number of failures of identifiability related to 3-cycles in the network.
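
As an illustration of the quantity described above, the following is a minimal sketch of how quartet concordance factors might be estimated empirically for one set of four taxa: each sampled gene tree displays one of the three resolved 4-taxon relationships, and the relative frequencies approximate the model probabilities. The function names, the `"ab|cd"` label encoding, and the counts are hypothetical and chosen only for illustration; they are not taken from the paper, which works with the theoretical probabilities under the network multispecies coalescent model.

```python
from collections import Counter


def possible_splits(a, b, c, d):
    """Return labels for the three resolved unrooted topologies on four taxa."""
    return [f"{a}{b}|{c}{d}", f"{a}{c}|{b}{d}", f"{a}{d}|{b}{c}"]


def empirical_cf(observed, taxa):
    """Estimate quartet concordance factors as relative frequencies.

    observed: list of split labels, one per sampled gene tree (hypothetical input).
    taxa: tuple of four taxon names.
    """
    splits = possible_splits(*taxa)
    counts = Counter(observed)
    unknown = set(counts) - set(splits)
    if unknown:
        raise ValueError(f"unrecognized quartet labels: {sorted(unknown)}")
    total = sum(counts[s] for s in splits)
    if total == 0:
        raise ValueError("no gene tree quartets observed")
    # Each CF is the fraction of gene trees displaying that 4-taxon relationship.
    return {s: counts[s] / total for s in splits}


# Made-up counts for illustration only: 10 gene trees on taxa a, b, c, d.
observed = ["ab|cd"] * 6 + ["ac|bd"] * 3 + ["ad|bc"] * 1
cf = empirical_cf(observed, ("a", "b", "c", "d"))
print(cf)
```

The three estimated values sum to 1, mirroring the fact that the concordance factors for a fixed set of four taxa form a probability distribution over the three resolved quartet topologies.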

Paper Structure (27 sections, 40 theorems, 44 equations, 16 figures)

Key Result

Lemma 3.2

Under the NMSC on a level-1 network $N^+$, the values of the quartet $CF$s depend only on the induced semidirected network $N$.

Figures (16)

  • Figure 1: (L) A rooted network $N^+$ on $X$ with root $r = \text{LSA}(X)$, and (R) the unrooted network $N^-$ obtained from $N^+$.
  • Figure 1: Networks with 3-cycles inducing $(1,1,n-2)$ partitions. The shaded triangle represents an arbitrary semidirected subnetwork. (L,R) correspond to cases (1,2) of \ref{prop:3-cyc11}.
  • Figure 1: A semidirected network with edges defined by sets $Q$ of 4 taxa highlighted in blue.
  • Figure 1: (L) A network with a 4-cycle of interest at bottom, and (R) a network with a single cycle obtained by removing a hybrid edge from each pair not in the cycle of interest.
  • Figure 2: Several semidirected quartet networks induced from the network in \ref{fig::net}.
  • ...and 11 more figures

Theorems & Definitions (67)

  • Definition 2.1
  • Definition 2.2
  • Definition 3.1
  • Lemma 3.2
  • Theorem 3.3
  • Lemma 4.1
  • Corollary 4.2
  • Proposition 4.3
  • Proof 1
  • Theorem 4.4
  • ...and 57 more