Projected Gradient Descent Method for Tropical Principal Component Analysis over Tree Space
Ruriko Yoshida
TL;DR
The paper addresses visualizing gene-tree distributions in the space of equidistant phylogenetic trees by extending tropical PCA to estimate the tropical principal polytope via a projected gradient descent method. It builds on tropical geometry in the tropical projective torus, utilizes a tropical projection onto ultrametric/tree space, and optimizes a sum of tropical distances to obtain a best-fit tropical polytope. The authors derive subgradients, implement a gradient-descent scheme with projection to maintain feasibility, and demonstrate improved objective values with faster runtime compared to MCMC on both simulated multispecies-coalescent data and an empirical eight-species dataset. This approach provides a computationally efficient alternative for summarizing hierarchical tree-structured genomic data in a tropical framework, with practical implications for phylogenomics analyses.
Abstract
In 2019, Yoshida et al. developed tropical Principal Component Analysis (PCA), an analogue of classical PCA in the setting of tropical geometry, and applied it to visualize a set of gene trees over a space of phylogenetic trees, which is a union of lower-dimensional polyhedral cones in a Euclidean space of dimension $m(m-1)/2$, where $m$ is the number of leaves. In this paper, we introduce a projected gradient descent method to estimate the tropical principal polytope over the space of phylogenetic trees, and we apply it to an Apicomplexa dataset. In computational experiments against Markov Chain Monte Carlo (MCMC) samplers, we show that our projected gradient descent method achieves a lower sum of tropical distances between observations and their projections onto an estimated best-fit tropical polytope than the MCMC approach proposed by Page et al. in 2020.
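To make the objective concrete, the following is a minimal Python sketch of the quantity being compared: the sum of tropical distances between observations and their projections onto a tropical polytope with given vertices. It assumes the max-plus convention, the standard tropical metric $d_{\mathrm{tr}}(u,v) = \max_i (u_i - v_i) - \min_i (u_i - v_i)$, and the usual closed-form tropical projection. The function names are hypothetical, and the sketch only evaluates the objective; it is not the projected gradient descent procedure itself.

```python
def tropical_distance(u, v):
    """Tropical metric: max_i(u_i - v_i) - min_i(u_i - v_i)."""
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def tropical_projection(x, vertices):
    """Project x onto the max-plus tropical polytope spanned by `vertices`.

    Assumed formula: lambda_k = min_i (x_i - v_k[i]) for each vertex v_k,
    and the projection has coordinates max_k (lambda_k + v_k[i]).
    """
    lambdas = [min(xi - vi for xi, vi in zip(x, v)) for v in vertices]
    return [
        max(lam + v[i] for lam, v in zip(lambdas, vertices))
        for i in range(len(x))
    ]


def sum_of_tropical_distances(points, vertices):
    """Objective: total tropical distance from each point to its projection."""
    return sum(
        tropical_distance(p, tropical_projection(p, vertices)) for p in points
    )


if __name__ == "__main__":
    # Toy vectors for illustration only, not phylogenetic data.
    verts = [[0, 0, 0], [0, 1, 2]]
    obs = [[0, 3, 1], [0, 1, 2]]
    print(sum_of_tropical_distances(obs, verts))
```

A point that already lies in the polytope, such as a vertex, projects to itself and contributes zero to the sum, so a smaller total indicates a better-fitting polytope.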
