Hierarchical Conditioning of Diffusion Models Using Tree-of-Life for Studying Species Evolution

Mridul Khurana, Arka Daw, M. Maruf, Josef C. Uyeda, Wasila Dahdul, Caleb Charpentier, Yasin Bakış, Henry L. Bart, Paula M. Mabee, Hilmar Lapp, James P. Balhoff, Wei-Lun Chao, Charles Stewart, Tanya Berger-Wolf, Anuj Karpatne

TL;DR

Phylo-Diffusion addresses how to visualize evolutionary trait changes from images by conditioning latent diffusion models on a four-level hierarchical embedding (HIER-Embed) derived from a discretized phylogenetic tree. The framework introduces trait masking and trait swapping to perturb embeddings in biologically meaningful ways, enabling observation of trait evolution across lineage branches. Empirical results on fishes and birds show that HIER-Embed captures phylogenetic distances, yields competitive image quality, and reveals interpretable trait changes aligned with evolutionary hypotheses. This approach offers a novel, image-based avenue for studying evolution, facilitating automated discovery of synapomorphies and rapid exploration of trait dynamics across the tree of life, while highlighting future directions for continuous, uncertainty-aware phylogenetic modeling.

Abstract

A central problem in biology is to understand how organisms evolve and adapt to their environment by acquiring variations in the observable characteristics or traits of species across the tree of life. With the growing availability of large-scale image repositories in biology and recent advances in generative modeling, there is an opportunity to accelerate the discovery of evolutionary traits automatically from images. Toward this goal, we introduce Phylo-Diffusion, a novel framework for conditioning diffusion models with phylogenetic knowledge represented in the form of HIERarchical Embeddings (HIER-Embeds). We also propose two new experiments for perturbing the embedding space of Phylo-Diffusion: trait masking and trait swapping, inspired by counterpart experiments of gene knockout and gene editing/swapping. Our work represents a novel methodological advance in generative modeling to structure the embedding space of diffusion models using tree-based knowledge. Our work also opens a new chapter of research in evolutionary biology by using generative models to visualize evolutionary changes directly from images. We empirically demonstrate the usefulness of Phylo-Diffusion in capturing meaningful trait variations for fishes and birds, revealing novel insights about the biological mechanisms of their evolution.
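
To make the embedding perturbations concrete, here is a minimal sketch of a four-level hierarchical embedding with trait masking and trait swapping. It is an illustration under stated assumptions, not the authors' implementation: the toy tree in `PATHS`, the node IDs, the vector size `DIM`, and every function name are hypothetical; fixed random vectors stand in for learned embeddings; and replacing a masked level with Gaussian noise is an assumed form of masking. The diffusion model itself is omitted.

```python
import math
import random

DIM = 16  # per-level vector size; an arbitrary choice for this sketch

# Hypothetical toy tree: ancestor IDs at levels 1-3, species ID at level 4.
PATHS = {
    "Lepomis gulosus": ("n1", "n1.1", "n1.1.1", "gulosus"),
    "Lepomis macrochirus": ("n1", "n1.1", "n1.1.1", "macrochirus"),
    "Cyprinus carpio": ("n2", "n2.1", "n2.1.1", "carpio"),
}


def node_vector(level, node_id):
    """Stand-in for a learned embedding: a fixed random vector per tree node."""
    rng = random.Random(f"{level}:{node_id}")
    return [rng.gauss(0.0, 1.0) for _ in range(DIM)]


def hier_embed(species):
    """Return four vectors, one per phylogenetic level (root side first)."""
    return [node_vector(lvl, node) for lvl, node in enumerate(PATHS[species], start=1)]


def flatten(levels):
    """Concatenate the per-level vectors into one conditioning vector."""
    return [x for vec in levels for x in vec]


def mask_level(levels, level, rng):
    """Trait masking (assumed form): overwrite one level with Gaussian noise."""
    out = [list(vec) for vec in levels]
    out[level - 1] = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
    return out


def swap_level(target, source, level):
    """Trait swapping (assumed form): copy one level from source into target."""
    out = [list(vec) for vec in target]
    out[level - 1] = list(source[level - 1])
    return out


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


gulosus = hier_embed("Lepomis gulosus")
macrochirus = hier_embed("Lepomis macrochirus")
carpio = hier_embed("Cyprinus carpio")

# Mask the species-level vector; levels 1-3 (shared ancestry) are untouched.
masked = mask_level(gulosus, level=4, rng=random.Random(0))
# Swap in the level-2 vector of a species from another branch of the toy tree.
swapped = swap_level(gulosus, carpio, level=2)

d_sibling = cosine_distance(flatten(gulosus), flatten(macrochirus))
d_distant = cosine_distance(flatten(gulosus), flatten(carpio))
print(f"sibling distance: {d_sibling:.3f}, distant distance: {d_distant:.3f}")
```

In this toy setup, two species that share their first three level vectors should sit closer in cosine distance than species with no shared ancestor nodes, which is the kind of structure the TL;DR refers to when it says HIER-Embed captures phylogenetic distances. In the actual framework the perturbed embedding would condition the latent diffusion model, and the generated images would be inspected for trait changes.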

Paper Structure (39 sections, 4 equations, 24 figures, 12 tables)

Figures (24)

  • Figure 1: Overview of the Phylo-Diffusion framework. Every species in the tree of life (phylogenetic tree) is encoded as a HIERarchical Embedding (HIER-Embed) comprising four vectors (one for each phylogenetic level), which is used to condition a latent diffusion model to generate synthetic images of the species. By structuring the embedding space with phylogenetic knowledge, Phylo-Diffusion enables visualization of changes in the evolutionary traits of a species (circled pink) upon perturbing its embedding.
  • Figure 2: Schematic diagrams of the two proposed experiments for discovering evolutionary traits using Phylo-Diffusion.
  • Figure 3: Comparing the quality of synthetic images generated by different conditioning mechanisms in LDMs. Every row corresponds to a different species and we show two samples per species for every conditioning mechanism. The order of species from top to bottom is Cyprinus carpio, Notropis hudsonius, Lepomis auritus, Noturus exilis, and Gambusia affinis.
  • Figure 4: Comparing cosine distances in the embedding space of species for varying conditioning mechanisms.
  • Figure 5: Left: class probability distributions of images generated using embeddings at all four levels for two species, Lepomis gulosus and Lepomis macrochirus (shown in green), that belong to the same sub-tree up to level 3. Right: class probability distributions of images generated by masking level 4 (descendant species that share common ancestry up to level 3 are highlighted in green).
  • ...and 19 more figures