Analysis of Phylogeny Tracking Algorithms for Serial and Multiprocess Applications

Matthew Andres Moreno, Santiago Rodriguez Papa, Emily Dolson

TL;DR

This work formally describes procedures for phylogenetic analysis in both serial and distributed computing scenarios, and introduces a trie-based phylogenetic reconstruction approach for "hereditary stratigraphy" genome annotations.

Abstract

Since the advent of modern bioinformatics, the challenging, multifaceted problem of reconstructing phylogenetic history from biological sequences has hatched perennial statistical and algorithmic innovation. Studies of the phylogenetic dynamics of digital, agent-based evolutionary models motivate a peculiar converse question: how to best engineer tracking to facilitate fast, accurate, and memory-efficient lineage reconstructions? Here, we formally describe procedures for phylogenetic analysis in both serial and distributed computing scenarios. With respect to the former, we demonstrate reference-counting-based pruning of extinct lineages. For the latter, we introduce a trie-based phylogenetic reconstruction approach for "hereditary stratigraphy" genome annotations. This process allows phylogenetic relationships between genomes to be inferred by comparing their similarities, akin to reconstruction of natural history from biological DNA sequences. Phylogenetic analysis capabilities significantly advance distributed agent-based simulations as a tool for evolutionary research, and also benefit application-oriented evolutionary computing. Such tracing could extend also to other digital artifacts that proliferate through replication, like digital media and computer viruses.
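To make the trie-based idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simplified annotation in which each genome carries one inherited "differentia" value per generation and no values are ever discarded; the names `TrieNode`, `build_trie`, and `shared_prefix_length` are hypothetical. Genomes that share a longer prefix of differentia share more recent ancestry, so inserting all annotations into one trie yields a tree whose branch points approximate the phylogeny.

```python
class TrieNode:
    """One node per distinct prefix of differentia values."""

    def __init__(self):
        self.children = {}  # differentia value -> TrieNode
        self.taxa = []      # labels of genomes whose annotation ends here


def build_trie(annotations):
    """Insert every genome's differentia sequence into a shared trie.

    `annotations` maps a taxon label to its list of differentia, one per
    generation (assumption: every generation's value is retained).
    """
    root = TrieNode()
    for label, differentia in annotations.items():
        node = root
        for value in differentia:
            # Reuse the existing branch if this prefix was already seen.
            node = node.children.setdefault(value, TrieNode())
        node.taxa.append(label)
    return root


def shared_prefix_length(a, b):
    """Count the leading generations over which two annotations agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Hand-picked illustrative annotations (hypothetical data).
annotations = {
    "a": [7, 3, 9, 4],
    "b": [7, 3, 9, 1],
    "c": [7, 3, 2, 8],
}
root = build_trie(annotations)

# "a" and "b" agree for three generations, "a" and "c" for only two,
# so "a" and "b" are inferred to be more closely related.
print(shared_prefix_length(annotations["a"], annotations["b"]))
print(shared_prefix_length(annotations["a"], annotations["c"]))
```

In this toy setting the trie branches exactly where lineages diverged. With small differentia values, unrelated lineages can match by chance, so real reconstructions would only estimate divergence points.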

Paper Structure

This paper contains 25 sections, 3 theorems, 2 equations, 2 algorithms.

Key Result

Theorem 1

Naive Perfect Tracking Time Complexity. The naive perfect tracking algorithm can be implemented in constant time ($\mathcal{O}(1)$) per birth event.
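One way to see why a constant-time bound per birth is plausible is the sketch below. It is an illustrative assumption about what naive perfect tracking could look like, not the paper's algorithm listing; `Node`, `NaiveTracker`, and `lineage` are hypothetical names. Each birth allocates one record holding a parent pointer and appends it to a list, which in Python is amortized constant-time work regardless of how large the phylogeny has grown.

```python
class Node:
    """Record of one organism: an id and a pointer to its parent."""

    __slots__ = ("taxon_id", "parent")

    def __init__(self, taxon_id, parent=None):
        self.taxon_id = taxon_id
        self.parent = parent


class NaiveTracker:
    """Keeps a record of every organism ever born; nothing is pruned."""

    def __init__(self):
        self.records = []

    def birth(self, parent=None):
        # One allocation plus one list append: amortized O(1) per birth.
        node = Node(len(self.records), parent)
        self.records.append(node)
        return node


def lineage(node):
    """Walk parent pointers from an organism back to the founder."""
    ids = []
    while node is not None:
        ids.append(node.taxon_id)
        node = node.parent
    return ids


tracker = NaiveTracker()
founder = tracker.birth()
child = tracker.birth(founder)
grandchild = tracker.birth(child)
sibling = tracker.birth(founder)

print(lineage(grandchild))
print(len(tracker.records))
```

Because no record is ever discarded in this sketch, memory grows with the total number of births, including extinct lineages.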

Theorems & Definitions (6)

  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Theorem 3
  • proof