Projected Gradient Descent Method for Tropical Principal Component Analysis over Tree Space

Ruriko Yoshida

TL;DR

The paper addresses visualizing gene-tree distributions in the space of equidistant phylogenetic trees by extending tropical PCA to estimate the tropical principal polytope via a projected gradient descent method. It builds on tropical geometry in the tropical projective torus, utilizes a tropical projection onto ultrametric/tree space, and optimizes a sum of tropical distances to obtain a best-fit tropical polytope. The authors derive subgradients, implement a gradient-descent scheme with projection to maintain feasibility, and demonstrate improved objective values with faster runtime compared to MCMC on both simulated multispecies-coalescent data and an empirical eight-species dataset. This approach provides a computationally efficient alternative for summarizing hierarchical tree-structured genomic data in a tropical framework, with practical implications for phylogenomics analyses.

Abstract

In 2019, Yoshida et al. developed tropical Principal Component Analysis (PCA), an analogue of classical PCA in the setting of tropical geometry, and applied it to visualize a set of gene trees over a space of phylogenetic trees, which is a union of lower-dimensional polyhedral cones in a Euclidean space of dimension $m(m-1)/2$, where $m$ is the number of leaves. In this paper, we introduce a projected gradient descent method to estimate the tropical principal polytope over the space of phylogenetic trees, and we apply it to an Apicomplexa dataset. In computational experiments against Markov Chain Monte Carlo (MCMC) samplers, we show that our projected gradient descent achieves a lower sum of tropical distances between observations and their projections onto an estimated best-fit tropical polytope than the MCMC approach proposed by Page et al. in 2020.
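
The objective named in the abstract, the sum of tropical distances between observations and their projections onto a tropical polytope, can be illustrated with a minimal sketch. It assumes the max-plus convention, the standard tropical metric $d_{\rm tr}(u, v) = \max_i (u_i - v_i) - \min_i (u_i - v_i)$, and the standard projection $\pi(u) = \bigoplus_l \lambda_l \odot v^l$ with $\lambda_l = \min_i (u_i - v^l_i)$. The function names and the toy vectors are hypothetical, and this is not the paper's implementation or its gradient descent scheme.

```python
def tropical_distance(u, v):
    """Tropical metric: max_i(u_i - v_i) - min_i(u_i - v_i)."""
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def tropical_projection(u, vertices):
    """Project u onto the max-plus tropical polytope spanned by `vertices`.

    Assumed max-plus formula: lambda_l = min_i(u_i - v^l_i), and the
    projection is the coordinate-wise maximum of lambda_l + v^l.
    """
    lambdas = [min(a - b for a, b in zip(u, v)) for v in vertices]
    return [
        max(lam + v[i] for lam, v in zip(lambdas, vertices))
        for i in range(len(u))
    ]


def sum_of_tropical_distances(observations, vertices):
    """Objective value: total tropical distance from each observation to its projection."""
    return sum(
        tropical_distance(u, tropical_projection(u, vertices))
        for u in observations
    )


# Toy data (hypothetical): three vertices in R^3 and two observations.
vertices = [[0, 0, 0], [0, 3, 1], [0, 2, 5]]
observations = [[0, 1, 2], [0, 3, 1]]
print(sum_of_tropical_distances(observations, vertices))
```

Both toy observations lie in the polytope, so each projects to itself and contributes zero to the objective. A method that estimates the best-fit polytope would move the vertices to reduce this sum for observations that do not all fit.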

Paper Structure

This paper contains 9 sections, 9 theorems, 42 equations, 3 figures.

Key Result

Theorem 10

Suppose we have a rooted phylogenetic tree $T$ with leaf label set $[m]$, and let $D$ be its tree metric. Then $D$ is an ultrametric if and only if $T$ is an equidistant tree. In addition, $T$ can be reconstructed uniquely from $D$.
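
A dissimilarity $D$ is an ultrametric when, for every triple of distinct leaves $i, j, k$, the maximum of $D(i,j)$, $D(i,k)$, $D(j,k)$ is attained at least twice. The sketch below checks this three-point condition on a pairwise distance matrix. The function name, the tolerance, and the example matrices are hypothetical; the first matrix is a made-up equidistant tree on four leaves with height $1$, not the tree from the paper's figure.

```python
from itertools import combinations


def is_ultrametric(D, tol=1e-9):
    """Check the three-point condition on a symmetric matrix with zero diagonal.

    For every triple of leaves, the two largest pairwise distances
    must be equal (up to `tol`).
    """
    n = len(D)
    for i, j, k in combinations(range(n), 3):
        _, second, largest = sorted((D[i][j], D[i][k], D[j][k]))
        if largest - second > tol:
            return False
    return True


# Hypothetical equidistant tree of height 1 on leaves {1, 2, 3, 4}:
# leaves 1 and 2 merge at height 0.25, leaf 3 joins at 0.5, leaf 4 at the root.
# Each pairwise distance is twice the height of the most recent common ancestor.
D_equidistant = [
    [0.0, 0.5, 1.0, 2.0],
    [0.5, 0.0, 1.0, 2.0],
    [1.0, 1.0, 0.0, 2.0],
    [2.0, 2.0, 2.0, 0.0],
]

# Distances along a path 1 - 2 - 3: a tree metric, but not an ultrametric.
D_path = [
    [0, 1, 3],
    [1, 0, 2],
    [3, 2, 0],
]

print(is_ultrametric(D_equidistant), is_ultrametric(D_path))
```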

Figures (3)

  • Figure 1: An equidistant tree with leaf label set $[4]$ from Example \ref{eg:equid}. The height of the tree is $1$.
  • Figure 2: Side-by-side boxplots of SE for each method and each ratio. We repeat the computation 10 times for each ratio and each method.
  • Figure 3: Left: Estimated second-order tropical principal polytope. Right: Each color represents a tree topology. The number inside each bracket is the frequency of the tree topology. 1 represents "Pv", 2 represents "Pf", 3 represents "Tg", 4 represents "Et", 5 represents "Cp", 6 represents "Ta", 7 represents "Bb" and 8 represents "Tt".

Theorems & Definitions (33)

  • Definition 1: Tropical Arithmetic Operations
  • Remark 1
  • Definition 2: Tropical Scalar Multiplication and Vector Addition
  • Definition 3: Generalized Hilbert Projective Metric
  • Remark 2
  • Definition 4
  • Definition 5: Max-tropical Hyperplane ETC
  • Definition 6: Min-tropical Hyperplane ETC
  • Remark 3
  • Definition 7: Max-tropical Sectors from Section 5.5 in joswigBook
  • ...and 23 more