Perfect taxon sampling and fixing taxon traceability: Introducing a class of phylogenetically decisive collections of taxon sets

Mareike Fischer; Janne Pott

TL;DR

This work addresses how to guarantee a unique supertree from multiple input taxon sets. It introduces fixing taxa and fixing taxon traceable collections, a subclass of collections that can be recognised in polynomial time and that ensures phylogenetic decisiveness for unrooted trees. The paper contrasts this subclass with the general decisiveness problem, which is coNP-complete for unrooted trees, proves that fixing taxon traceability implies decisiveness, and provides a polynomial-time algorithm to detect it via 3-overlap graphs. The authors derive bounds on the number of input quadruples required for fixing taxon traceability and for decisiveness, correct an erroneous lower bound from the literature, and present constructions achieving near-optimal bounds; they also show that decisiveness can occur without fixing taxon traceability. A further contribution is the R package FixingTaxonTraceR, together with extensive simulations that quantify the relationship between the two concepts and offer practical guidance for designing taxon sampling and supertree construction with guaranteed outcomes in large data sets.

Abstract

Phylogenetically decisive collections of taxon sets have the property that if trees are chosen for each of their elements, as long as these trees are compatible, the resulting supertree is unique. This means that as long as the trees describing the phylogenetic relationships of the (input) species sets are compatible, they can only be combined into a common supertree in precisely one way. This setting is sometimes also referred to as "perfect taxon sampling". For rooted trees, whether a given collection of input taxon sets is phylogenetically decisive can be decided in polynomial time. For unrooted trees, however, the corresponding decision problem is coNP-complete and therefore hard to solve in practice for large instances, so recognizing such collections is often difficult. In this paper, we explain phylogenetic decisiveness and introduce a class of input taxon sets, namely so-called fixing taxon traceable sets, which are guaranteed to be phylogenetically decisive and which can be recognized in polynomial time. Using both combinatorial approaches and simulations, we compare properties of fixing taxon traceability and phylogenetic decisiveness, e.g., by deriving lower and upper bounds for the number of quadruple sets (i.e., sets of 4-tuples) needed in the input set for each of these properties. In particular, we correct an erroneous lower bound concerning phylogenetic decisiveness from the literature. We have implemented the algorithm to determine if a given collection of taxon sets is fixing taxon traceable in R and made our software package FixingTaxonTraceR publicly available.
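
To make the polynomial-time recognition concrete, the following is a minimal Python sketch of a fixing taxon traceability check. It is not the FixingTaxonTraceR implementation and does not use 3-overlap graphs; it is a plain closure procedure. It rests on an assumption about the definition that the abstract does not spell out: a taxon x is taken to fix an unresolved quadruple q if all four quadruples obtained by replacing one taxon of q with x are already resolved, and a collection is fixing taxon traceable if every quadruple can be resolved this way, starting from the quadruples contained in some input set. The function names are hypothetical.

```python
from itertools import combinations


def input_quadruples(taxon_sets):
    """Return all 4-subsets contained in at least one input taxon set."""
    quads = set()
    for taxon_set in taxon_sets:
        quads.update(frozenset(q) for q in combinations(sorted(taxon_set), 4))
    return quads


def is_fixing_taxon_traceable(taxa, taxon_sets):
    """Closure check under the ASSUMED definition of a fixing taxon.

    A taxon x outside an unresolved quadruple q is treated as a fixing
    taxon of q if every quadruple formed by x and three taxa of q is
    already resolved. Resolved quadruples are added until nothing changes.
    """
    taxa = sorted(taxa)
    resolved = input_quadruples(taxon_sets)
    unresolved = {frozenset(q) for q in combinations(taxa, 4)} - resolved

    progress = True
    while unresolved and progress:
        progress = False
        for quad in list(unresolved):
            for x in taxa:
                if x in quad:
                    continue
                # Replace each taxon of quad by x in turn and check coverage.
                if all((quad - {a}) | {x} in resolved for a in quad):
                    resolved.add(quad)
                    unresolved.discard(quad)
                    progress = True
                    break
    # Traceable (under the assumption above) iff no quadruple is left over.
    return not unresolved


if __name__ == "__main__":
    taxa = {1, 2, 3, 4, 5}
    # Every quadruple containing taxon 1 is an input quadruple, so taxon 1
    # fixes the only missing quadruple {2, 3, 4, 5}.
    print(is_fixing_taxon_traceable(taxa, [{1, 2, 3, 4}, {1, 2, 3, 5},
                                           {1, 2, 4, 5}, {1, 3, 4, 5}]))
    # With only two input sets, no missing quadruple has a fixing taxon.
    print(is_fixing_taxon_traceable(taxa, [{1, 2, 3, 4}, {1, 2, 3, 5}]))
```

Each pass scans at most all 4-subsets of the taxon set and all candidate taxa, and every successful pass resolves at least one quadruple, so the running time is polynomial in the number of taxa.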
