FLORAH-Tree: Emulating Dark Matter Halo Merger Trees with Graph Generative Models
Tri Nguyen, Chirag Modi, Siddharth Mishra-Sharma, L. Y. Aaron Yung, Rachel S. Somerville
TL;DR
FLORAH-Tree generates complete dark matter halo merger trees with environmental information, extending the prior FLORAH model from main progenitor branches to full branching histories. It combines an RNN-based history encoder, a multinomial classifier for the number of progenitors, and a normalizing-flow neural density estimator that autoregressively generates progenitor properties conditioned on history and redshift. The model is trained on VSMDPL N-body merger trees and compared against both the simulation and EPS-based trees. It accurately reproduces progenitor mass distributions and merger rates, and yields galaxy-halo scaling relations in close agreement with the reference simulation when run through the Santa Cruz SAM. FLORAH-Tree provides a fast, scalable alternative to full simulations for structure formation studies and enables environmentally informed tree generation, with potential extensions to multi-cosmology conditioning and lightcone applications.
Abstract
Merger trees track the hierarchical assembly of dark matter halos across cosmic time and serve as essential inputs for semi-analytic models of galaxy formation. However, conventional methods for constructing merger trees rely on ad-hoc assumptions and are unable to incorporate environmental information. Nguyen et al. (2024) introduced FLORAH, a generative model based on recurrent neural networks and normalizing flows, for modeling main progenitor branches of merger trees. In this work, we extend this model, now referred to as FLORAH-Tree, to generate complete merger trees by representing them as graph structures that capture the full branching hierarchy. We trained FLORAH-Tree on merger trees extracted from the Very Small MultiDark Planck cosmological N-body simulation. To validate our approach, we compared the generated merger trees with both the original simulation data and with semi-analytic trees produced using the Extended Press-Schechter (EPS) formalism. We show that FLORAH-Tree accurately reproduces key merger rate statistics across a wide range of mass and redshift, outperforming the conventional EPS-based approach. We demonstrate its utility by applying the Santa Cruz semi-analytic model (SAM) to generated trees and showing that the resulting galaxy-halo scaling relations, such as the stellar-to-halo-mass relation and supermassive black hole mass-halo mass relation, closely match those from applying the SAM to trees extracted directly from the simulation. FLORAH-Tree provides a computationally efficient method for generating merger trees that maintain the statistical fidelity of N-body simulations.
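The generation procedure summarized above (encode a branch's history, choose the number of progenitors, then draw progenitor properties one at a time) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `Halo`, `sample_num_progenitors`, `sample_log_mass`, `grow`, and `generate_tree` are hypothetical, and the learned components are replaced by toy random draws: fixed probabilities stand in for the multinomial classifier, an exponential mass decrement stands in for the normalizing flow, and the raw history list stands in for the RNN encoding. The mass floor `min_log_mass` and the rule that each progenitor is no more massive than its previous sibling are also assumptions made for illustration.

```python
import random
from dataclasses import dataclass, field
from typing import List, Tuple

# History of one branch: (log10 mass, redshift) pairs, root first.
History = List[Tuple[float, float]]


@dataclass
class Halo:
    log_mass: float
    redshift: float
    progenitors: List["Halo"] = field(default_factory=list)


def sample_num_progenitors(history: History, rng: random.Random) -> int:
    # Toy stand-in for the multinomial classifier (assumption):
    # fixed probabilities that ignore the history.
    return rng.choices([1, 2, 3], weights=[0.7, 0.2, 0.1])[0]


def sample_log_mass(history: History, upper: float, rng: random.Random) -> float:
    # Toy stand-in for the normalizing flow (assumption):
    # draw a log-mass at or below the bound `upper`.
    return upper - rng.expovariate(5.0)


def grow(halo: Halo, history: History, redshifts: List[float],
         step: int, rng: random.Random, min_log_mass: float) -> None:
    """Recursively attach progenitors to `halo` at redshifts[step]."""
    if step >= len(redshifts):
        return
    history = history + [(halo.log_mass, halo.redshift)]
    upper = halo.log_mass
    for _ in range(sample_num_progenitors(history, rng)):
        # Autoregressive step: each draw is bounded by the previous sibling.
        log_mass = sample_log_mass(history, upper, rng)
        if log_mass < min_log_mass:
            break  # below the assumed mass floor
        child = Halo(log_mass, redshifts[step])
        halo.progenitors.append(child)
        grow(child, history, redshifts, step + 1, rng, min_log_mass)
        upper = log_mass


def generate_tree(root_log_mass: float, redshifts: List[float],
                  seed: int = 0, min_log_mass: float = 10.0) -> Halo:
    rng = random.Random(seed)
    root = Halo(root_log_mass, redshifts[0])
    grow(root, [], redshifts, 1, rng, min_log_mass)
    return root


def count_halos(halo: Halo) -> int:
    return 1 + sum(count_halos(p) for p in halo.progenitors)


tree = generate_tree(13.0, [0.0, 0.5, 1.0, 2.0, 4.0])
print("halos in tree:", count_halos(tree))
```

The nested `progenitors` lists form the graph structure: each halo is a node, and each progenitor link is an edge pointing to higher redshift. In a trained model, the two sampling functions would be replaced by networks conditioned on the encoded history and the target redshift.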
