A strengthened bound on the number of states required to characterize maximum parsimony distance

Mareike Fischer; Steven Kelk; Sofia Vazquez Alferez

TL;DR

This work bounds the number of states a convex character needs in order to realize the maximum parsimony distance $d_{\mathrm{MP}}(T_1,T_2)$ between two unrooted binary phylogenetic trees. Using an adjacency theorem together with Fitch's algorithm, the authors prove an improved upper bound of $2\,d_{\mathrm{MP}}(T_1,T_2)$ states. They also give a constructive family of tree pairs showing that, for every $k \geq 1$, there are trees with $d_{\mathrm{MP}}(T_1,T_2) = k$ for which at least $k+1$ states are necessary. An empirical study on 644 tree pairs shows that far fewer states (on average about $0.44\,d_{\mathrm{MP}}$) typically suffice in practice. The results have algorithmic implications for the exact computation of $d_{\mathrm{MP}}$, since fewer convex characters need to be enumerated. The authors also discuss the gap that remains between the proven bound and the conjectured bound of $d_{\mathrm{MP}}+1$, and outline directions for future kernelization-related improvements.

Abstract

In this article we prove that the distance $d_{\mathrm{MP}}(T_1,T_2) = k$ between two unrooted binary phylogenetic trees $T_1, T_2$ on the same set of taxa can be defined by a character that is convex on one of $T_1, T_2$ and which has at most $2k$ states. This significantly improves upon the previous bound of $7k-5$ states. We also show that for every $k \geq 1$ there exist two trees $T_1, T_2$ with $d_{\mathrm{MP}}(T_1,T_2) = k$ such that at least $k+1$ states are necessary in any character that achieves this distance and which is convex on one of $T_1, T_2$. We augment these lower and upper bounds with an empirical analysis which shows that in practice significantly fewer than $k+1$ states are usually required.
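To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard definition $d_{\mathrm{MP}}(T_1,T_2) = \max_f |l_f(T_1) - l_f(T_2)|$, where $l_f(T)$ is the parsimony score of character $f$ on tree $T$, computes $l_f$ with Fitch's algorithm, and finds the maximum by brute force over all characters. The nested-tuple tree encoding and the names `fitch_score`, `leaves`, and `mp_distance_bruteforce` are illustrative assumptions, and the exhaustive search is only feasible for very small trees.

```python
from itertools import product


def fitch_score(tree, character):
    """Parsimony score of `character` on `tree` via Fitch's algorithm.

    `tree` is a nested 2-tuple of leaf labels, i.e. an unrooted binary tree
    rooted on an arbitrary edge (the score does not depend on that choice).
    `character` maps each leaf label to a state.
    """
    def visit(node):
        if not isinstance(node, tuple):  # leaf: its own state, no changes yet
            return {character[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        common = left_set & right_set
        if common:  # children can agree: no extra change needed
            return common, left_cost + right_cost
        # children cannot agree: take the union and count one change
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def leaves(tree):
    """Leaf labels of a nested-tuple tree, left to right."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def mp_distance_bruteforce(t1, t2):
    """Maximize |l_f(t1) - l_f(t2)| over all characters f (tiny trees only)."""
    taxa = sorted(leaves(t1))
    assert taxa == sorted(leaves(t2)), "trees must share the same taxa"
    best, witness = 0, None
    # With n taxa, at most n distinct states are ever needed.
    for states in product(range(len(taxa)), repeat=len(taxa)):
        f = dict(zip(taxa, states))
        diff = abs(fitch_score(t1, f) - fitch_score(t2, f))
        if diff > best:
            best, witness = diff, f
    return best, witness


# Two different quartet trees on the taxa a, b, c, d.
T1 = (("a", "b"), ("c", "d"))
T2 = (("a", "c"), ("b", "d"))
distance, f = mp_distance_bruteforce(T1, T2)
print(distance, f, len(set(f.values())))
```

For this quartet pair the search yields distance $k = 1$, witnessed by a two-state character, which is consistent with both the $2k$ upper bound and the $k+1$ lower bound. The enumeration grows as $n^n$ in the number of taxa $n$, which is why restricting attention to convex characters with few states matters for exact computation.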
