Split-or-decompose: Improved FPT branching algorithms for maximum agreement forests

David Mestel, Steven Chaplick, Steven Kelk, Ruben Meuwese

Abstract

Phylogenetic trees are leaf-labelled trees used to model the evolution of species. In practice it is not uncommon to obtain two topologically distinct trees for the same set of species, and this motivates the use of distance measures to quantify dissimilarity. A well-known measure is the maximum agreement forest (MAF): a minimum-size partition of the leaf labels which splits both trees into the same set of disjoint, leaf-labelled subtrees (up to isomorphism after suppressing degree-2 vertices). Computing such a MAF is NP-hard and so considerable effort has been invested in finding FPT algorithms, parameterised by $k$, the number of components of a MAF. The state of the art has been unchanged since 2015, with running times of $O^*(3^k)$ for unrooted trees and $O^*(2.3431^k)$ for rooted trees. In this work we present improved algorithms for both the unrooted and rooted cases, with runtimes $O^*(2.846^k)$ and $O^*(2.3391^k)$ respectively. The key to our improvement is a novel branching strategy in which we show that any overlapping components obtained on the way to a MAF can be "split" by a branching rule with favourable branching factor, and then the problem can be decomposed into disjoint subproblems to be solved separately. We expect that this technique may be more widely applicable to other problems in algorithmic phylogenetics.

Paper Structure (13 sections, 6 theorems, 4 figures)

Key Result

Lemma 1

Let $T$ be a binary phylogenetic tree and $(Y,Z)$ a non-trivial bipartition of the labels of $T$. Then there exists a splitting core $C$ such that $\sum_{K\in C} 2^{-|K|} \leq \frac{1}{2}$.

Figures (4)

  • Figure 1: Left: two unrooted binary phylogenetic trees on taxa $X=\{a,b,c,d,e\}$. An uMAF for these two trees has 2 components, e.g. $\{\{a,b,c\}, \{d,e\}\}$. Right: two rooted binary phylogenetic trees on the same set of taxa. An rMAF for these two trees has 3 components, e.g. $\{\{a,c\}, \{b\}, \{d,e\}\}$.
  • Figure 2: An illustration of Chen's branching strategy for unrooted agreement forests when $\{a,b\}$ is a cherry in $T$ and $a$ and $b$ are together in a component $B \in F'$. Here $t=3$ and the resulting recurrence is $2T(k-1)+3T(k-2)$.
  • Figure 3: An illustration of Whidden's branching strategy for rooted agreement forests when $\{a,b\}$ is a cherry in $T$ and $a$ and $b$ are together in a component $B \in F'$. Here $t=3$, and the resulting recurrence is $2T(k-1)+T(k-3)$.
  • Figure 4:
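
The recurrences quoted in the captions of Figures 2 and 3 can be turned into branching factors numerically. The sketch below is an illustration only, not code from the paper: the function name `branching_number` and the dictionary encoding of a recurrence are assumptions made for this example. It finds the root $x > 1$ of $\sum_d c_d\, x^{-d} = 1$ by bisection, which is the standard way to read a bound of the form $O^*(x^k)$ off a recurrence $T(k) = \sum_d c_d\, T(k-d)$.

```python
def branching_number(vector, tol=1e-12):
    """Return the root x > 1 of sum(c * x**(-d)) == 1.

    `vector` is a hypothetical encoding chosen for this sketch: it maps
    each parameter decrease d to the number c of branches with that
    decrease, e.g. {1: 2, 2: 3} for T(k) = 2T(k-1) + 3T(k-2).
    Assumes at least two branches in total, so that a root above 1 exists.
    """
    def excess(x):
        # Strictly decreasing in x for x > 0; positive at x = 1.
        return sum(c * x ** (-d) for d, c in vector.items()) - 1.0

    lo, hi = 1.0, 2.0
    while excess(hi) > 0:      # widen the bracket until it encloses the root
        hi *= 2.0
    while hi - lo > tol:       # plain bisection
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Recurrence from the Figure 2 caption: 2T(k-1) + 3T(k-2)
unrooted_case = branching_number({1: 2, 2: 3})

# Recurrence from the Figure 3 caption: 2T(k-1) + T(k-3)
rooted_case = branching_number({1: 2, 3: 1})

print(f"{unrooted_case:.4f}  {rooted_case:.4f}")
```

The first recurrence has characteristic equation $x^2 = 2x + 3$, whose positive root is exactly 3, consistent with the $O^*(3^k)$ unrooted bound mentioned in the abstract. The second evaluates to roughly 2.2056 for this single case with $t=3$; the overall running time of a branching algorithm is governed by its worst case across all rules, which the captions alone do not list.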

Theorems & Definitions (6)

  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 3
  • Lemma 3
  • Lemma 3