Characterizing semi-directed phylogenetic networks and their multi-rootable variants

Niels Holtgrefe, Katharina T. Huber, Leo van Iersel, Mark Jones, Vincent Moulton

TL;DR

This work gives explicit, checkable characterizations of (multi-)semi-directed phylogenetic networks (mixed graphs obtained by semi-deorientation of rooted networks) and of when such networks admit a rooting within established rooted classes. It extends foundational rooted-network tools (cherry picking sequences, omnians, path partitions) to the semi-directed setting, deriving necessary and sufficient conditions for a mixed graph to be a (multi-)semi-directed network and to admit rootings contained in the tree-child, orchard, tree-based, or forest-based classes. Key contributions include explicit characterizations (with linear-time algorithms) for root configurations and feasibility of rootings, and connections between omnians, cherry picking sequences, and path partitions in the semi-directed context. The results strengthen the theoretical foundation for semi-directed networks and support computational analyses and algebraic evolutionary models based on semi-directed topologies.

Abstract

In evolutionary biology, phylogenetic networks are graphs that provide a flexible framework for representing complex evolutionary histories that involve reticulate evolutionary events. Recently, phylogenetic studies have started to focus on a special class of such networks called semi-directed networks. These graphs are defined as mixed graphs that can be obtained by de-orienting some of the arcs in some rooted phylogenetic network, that is, a directed acyclic graph whose leaves correspond to a collection of species and that has a single source or root vertex. However, this definition of semi-directed networks is implicit in nature, since it is not clear when a mixed graph has this property. In this paper, we introduce novel, explicit mathematical characterizations of semi-directed networks, and also of multi-semi-directed networks, that is, mixed graphs that can be obtained from directed phylogenetic networks that may have more than one root. In addition, by extending foundational tools from the theory of rooted networks (such as cherry picking sequences, omnians, and path partitions) into the semi-directed setting, we characterize when a (multi-)semi-directed network can be obtained by de-orienting some rooted network that is contained in one of the well-known classes of tree-child, orchard, tree-based or forest-based networks. These results address structural aspects of (multi-)semi-directed networks and pave the way to improved theoretical and computational analyses of such networks, for example, in the development of algebraic evolutionary models that are based on such networks.
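
The de-orientation step described in the abstract can be illustrated with a small Python sketch. It assumes the common convention that arcs entering a reticulation (a vertex of in-degree at least 2) stay directed, all other arcs become undirected edges, and a root of out-degree 2 is suppressed; the abstract itself only says that "some of the arcs" are de-oriented, so this convention is an assumption. The function name `semi_deorient` and the example network are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def semi_deorient(arcs):
    """Sketch of semi-deorientation for a directed network given as (u, v) arcs.

    Assumed convention: arcs into reticulations (in-degree >= 2) stay directed,
    every other arc becomes an undirected edge, and roots of out-degree 2
    are suppressed. Returns (undirected_edges, directed_arcs).
    """
    arcs = set(arcs)
    indeg, outdeg = defaultdict(int), defaultdict(int)
    for u, v in arcs:
        outdeg[u] += 1
        indeg[v] += 1

    # Reticulation arcs keep their direction; all others are de-oriented.
    directed = {(u, v) for (u, v) in arcs if indeg[v] >= 2}
    edges = {frozenset(a) for a in arcs - directed}

    # Suppress every root (in-degree 0) that has exactly two outgoing arcs.
    roots = [x for x in list(outdeg) if indeg[x] == 0 and outdeg[x] == 2]
    for root in roots:
        root_edges = [e for e in edges if root in e]
        root_arcs = [a for a in directed if a[0] == root]
        if len(root_edges) == 2:
            # Two edges {root, x}, {root, y} merge into the single edge {x, y}.
            x, y = [next(iter(e - {root})) for e in root_edges]
            edges -= set(root_edges)
            edges.add(frozenset((x, y)))
        elif len(root_edges) == 1 and len(root_arcs) == 1:
            # Edge {root, y} and arc (root, x) merge into the arc (y, x).
            (y,) = root_edges[0] - {root}
            x = root_arcs[0][1]
            edges.discard(root_edges[0])
            directed.discard(root_arcs[0])
            directed.add((y, x))
        # A root whose two out-arcs both enter reticulations is left as is
        # in this sketch.
    return edges, directed

# Hypothetical rooted network with one reticulation r and leaves a, b, c.
rooted = [("rho", "u"), ("rho", "v"), ("u", "a"), ("u", "r"),
          ("v", "r"), ("v", "b"), ("r", "c")]
edges, directed = semi_deorient(rooted)
```

In this example only the two arcs entering `r` remain directed, and the root `rho` disappears because its two incident edges are merged into the single edge between `u` and `v`. Going in the other direction, that is, deciding whether a given mixed graph arises this way from some rooted network, is the problem the paper's characterizations address.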

Paper Structure

This paper contains 12 sections, 20 theorems, 6 equations, 15 figures.

Key Result

Lemma 1

Suppose $G=(V,E,A)$ is a connected mixed graph with $|V|\geq 3$. Then $G$ contains either a cherry or a leaf reticulation (or both) if the following properties hold:

Figures (15)

  • Figure 1: Five mixed graphs $N_1,N_2,G,D_1,D_2$. Mixed graph $N_1$ is a semi-directed network since it is the semi-deorientation of, for example, the rooted network $D_1$ illustrated below it. Mixed graph $N_2$ is a multi-semi-directed network since it is the semi-deorientation of, for example, the multi-rooted network $D_2$ illustrated below it. However, it can be shown that $N_2$ is not a semi-directed network (since two roots are needed). The mixed graph $G$ is neither semi-directed nor multi-semi-directed.
  • Figure 2: Three semi-directed networks $N_1$, $N_2$ and $N_3$. Semi-directed network $N_1$ is strongly tree-based since each of its rootings is tree-based. Semi-directed network $N_2$ is weakly tree-based, but not strongly, since the rooting obtained by directing all edges away from the vertex $r_1$ is tree-based, but the rooting obtained in a similar way using vertex $r_2$ is not. Semi-directed network $N_3$ is not weakly tree-based since it has no rooting that is tree-based.
  • Figure 3: Left: A $2$-semi-directed network $N$ on $\{a,b,\ldots, e\}$ with set of reticulations $\{r_1,r_2,r_3,r_4\}$. The sequence $(r_1,v_2,v_5,r_2,v_3,v_4,r_3,r_4,g)$ is a $\wedge$-path of $N$ and the sequence $C=(r_2,v_3,v_4,r_3,r_4,v_7,r_2)$ is a cycle of $N$. The vertices $r_2$ and $r_4$ are sinks of $C$ whereas $r_3$ is not. The path $(a,v_1,v_3,v_4,v_7)$ is an example of an edge-path. The leaves $a$ and $b$ form a cherry, leaves $d$ and $g$ are reticulation leaves, while $r_1$ and $r_4$ are leaf reticulations. The vertex sets of the source components are $\{a,b,v_1,v_3,v_4,v_7\}$ and $\{e\}$ while the sink components have vertex sets $\{d,r_1\}$ and $\{g,r_4\}$. Right: A rooting $D$ of $N$ in the form of a $2$-rooted network with roots $\rho_1$ and $\rho_2$.
  • Figure 4: Two mixed graphs $G$ and $G'$ that, by Theorem \ref{thm:semi-directed-cycle}, are not multi-semi-directed networks. The reason is that $G$ contains the semi-directed cycle $(v_1,v_2,v_3,v_4,v_1)$ while $G'$ contains the non-trivial edge-path $(v_1,v_2,v_3)$ and $v_1$ and $v_3$ are reticulations.
  • Figure 5: Left: A mixed graph $G_1$ which, by Corollary \ref{cor:semi-directed}, is not a semi-directed network since there is no $\wedge$-path between $v_1$ and $v_2$. Right: A mixed graph $G_2$ that, by Corollary \ref{cor:semi-directed}, is not a semi-directed network since it contains a cycle $(v_1,v_2,v_3,v_4,v_5,v_1)$ without a sink.
  • ...and 10 more figures

Theorems & Definitions (40)

  • Lemma 1
  • proof
  • Theorem 1
  • proof
  • Lemma 2
  • proof
  • Theorem 2
  • proof
  • Corollary 1
  • proof
  • ...and 30 more