Coconvex characters on collections of phylogenetic trees
Eva Czabarka, Steven Kelk, Vincent Moulton, Laszlo A. Szekely
TL;DR
This work studies coconvexity of characters on collections of phylogenetic trees. It asks for the minimum number of coconvex characters over all such collections, with particular attention to caterpillars. It develops a two-tree theory with lower bounds for $c_{n,k}$ and $c_n$, exact results $c_{n,k}=\binom{n}{k-1}$ for $k\le \lceil n/3\rceil$, and asymptotic upper bounds. It extends to $t\ge3$ trees via maximum agreement subtrees and derives per-$k$ lower bounds in regimes where common coconvex partitions must exist. A key outcome is a one-parameter family of tree metrics $d_k$ that interpolates between the Robinson-Foulds ($d_2$) and quartet ($d_{n-2}$) distances, linking coconvexity to tree-space geometry and diameter bounds. The results open directions for counting coconvex structures, understanding multi-tree tree spaces in phylogenomics, and developing efficient distance-based tools for phylogenetic analysis.
Abstract
In phylogenetics, a key problem is to construct evolutionary trees from collections of characters where, for a set X of species, a character is simply a function from X onto a set of states. A central concept in this context is convexity: a character is convex on a tree with leaf set X if the subtrees spanned by the sets of leaves that share a state are pairwise disjoint. Although collections of convex characters on a single tree have been studied extensively over the past few decades, very little is known about coconvex characters, that is, characters that are simultaneously convex on every tree in a collection. As a starting point for understanding coconvexity, in this paper we prove a number of extremal results for the following question: What is the minimal number of coconvex characters on a collection of n-leaved trees, taken over all collections of size t >= 2, and what is this number if we restrict to coconvex characters that map to k states? As an application of coconvexity, we introduce a new one-parameter family of tree metrics, which range between the coarse Robinson-Foulds distance and the much finer quartet distance. We show that bounds on the quantities in the above question translate into bounds on the diameter of the tree space under the new distances. Our results open up several new directions and questions with potential applications to, for example, tree spaces and phylogenomics.
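
The definitions of convexity and coconvexity given in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`tree_from_edges`, `spanned_vertices`, `is_convex`, `is_coconvex`), the representation of a tree as an adjacency dictionary, and the two four-leaf example trees are all assumptions made for illustration. The sketch computes, for each state, the minimal subtree spanned by the leaves carrying that state and then checks that these subtrees share no vertex.

```python
from itertools import combinations


def tree_from_edges(edges):
    """Build an adjacency dictionary for an undirected tree (hypothetical helper)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def spanned_vertices(adj, leaves):
    """Return the vertices of the minimal subtree of `adj` connecting `leaves`.

    Repeatedly prunes vertices of degree at most 1 that are not in `leaves`.
    """
    keep = set(adj)
    degree = {v: len(adj[v]) for v in adj}
    stack = [v for v in adj if degree[v] <= 1 and v not in leaves]
    while stack:
        v = stack.pop()
        if v not in keep:
            continue  # already pruned via an earlier stack entry
        keep.discard(v)
        for u in adj[v]:
            if u in keep:
                degree[u] -= 1
                if degree[u] <= 1 and u not in leaves:
                    stack.append(u)
    return keep


def is_convex(character, adj):
    """Check that the subtrees spanned by each state's leaves are pairwise disjoint."""
    by_state = {}
    for leaf, state in character.items():
        by_state.setdefault(state, set()).add(leaf)
    subtrees = [spanned_vertices(adj, block) for block in by_state.values()]
    return all(s.isdisjoint(t) for s, t in combinations(subtrees, 2))


def is_coconvex(character, trees):
    """A character is coconvex on a collection if it is convex on every tree in it."""
    return all(is_convex(character, adj) for adj in trees)


# Two illustrative trees on leaves a, b, c, d with internal vertices u, v:
# T1 groups {a, b} against {c, d}; T2 groups {a, c} against {b, d}.
T1 = tree_from_edges([("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")])
T2 = tree_from_edges([("a", "u"), ("c", "u"), ("u", "v"), ("b", "v"), ("d", "v")])

f = {"a": 0, "b": 0, "c": 1, "d": 1}  # two states
g = {"a": 0, "b": 0, "c": 1, "d": 2}  # three states

print(is_convex(f, T1), is_convex(f, T2))  # f is convex on T1 only
print(is_coconvex(g, [T1, T2]))            # g is convex on both trees
```

In this toy example the two-state character `f` is convex on `T1` but not on `T2`, so it is not coconvex on the pair, whereas the three-state character `g` is. The pruning approach is chosen for clarity rather than efficiency.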
