Estimating Dark Matter Halo Masses in Simulated Galaxy Clusters with Graph Neural Networks

Nikhil Garuda, John F. Wu, Dylan Nelson, Annalisa Pillepich

TL;DR

A graph neural network that predicts halo masses from stellar mass in simulated galaxy clusters, trained on TNG-Cluster and tested on TNG300 from the IllustrisTNG simulation suite, outperforms the baseline models tested.

Abstract

Galaxies grow and evolve in dark matter halos. Because dark matter is not visible, galaxies' halo masses ($\rm{M}_{\rm{halo}}$) must be inferred indirectly. We present a graph neural network (GNN) model for predicting $\rm{M}_{\rm{halo}}$ from stellar mass ($\rm{M}_{*}$) in simulated galaxy clusters using data from the IllustrisTNG simulation suite. Unlike traditional machine learning models such as random forests, our GNN captures the information-rich substructure of galaxy clusters by using spatial and kinematic relationships between neighbouring galaxies. A GNN model trained on the TNG-Cluster dataset and independently tested on the TNG300 simulation achieves better predictive performance than the baseline models we tested. Future work will extend this approach to different simulations and real observational datasets to further validate the GNN model's ability to generalise.

Paper Structure

This paper contains 14 sections, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Flow diagram of the GNN architecture used for halo mass prediction. The GNN processes node features ($x_i$, $x_j$) and edge features ($\epsilon_{ij}$) through multiple unshared layers, where each layer applies learnable functions, $\phi$, which are implemented as MLPs. These unshared layers operate in parallel across the graph structure. A pooling layer then aggregates ($\bigoplus$) the information from these interactions back into each node. Stacking further repetitions of these GNN layers can give the model more representational power. Finally, the output MLP, $\psi$, combines node features and aggregated edge features to predict each node's halo mass.
  • Figure 2: Predicted versus true ${\rm{M}_{\rm{halo}}}$ for the TNG-Cluster validation set, coloured by distance from cluster center.
  • Figure 3: Predicted versus true ${\rm{M}_{\rm{halo}}}$ for the TNG300 test set, coloured by distance from cluster center.
  • Figure 5: Validation set RMSE as a function of distance from cluster center. Results shown for the GNN (blue) and RF with ${\rm{M}_{*}}$ and $\Delta_G$ (orange).
  • Figure 6: Spatial distribution of halos within the TNG-Cluster simulation. The middle panel shows the full simulation, and the left and right panels highlight two example galaxy clusters. The boundaries of these clusters are marked as blue and red boxes in the middle panel.
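
The message-passing scheme described in the Figure 1 caption can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes sum pooling for $\bigoplus$, ReLU activations, two unshared $\phi$ MLPs, arbitrary layer widths, random untrained weights, and a toy graph of four galaxies. The helper names (`init_mlp`, `mlp`, `gnn_layer`, `predict`) and the feature dimensions are hypothetical.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random (untrained) weights for an MLP with the given layer sizes."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply an MLP: ReLU on hidden layers, linear output layer."""
    for k, (W, b) in enumerate(params):
        x = x @ W + b
        if k < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

def gnn_layer(phi_list, x, edges, eps):
    """One GNN layer.

    x:     (N, F) node features x_i
    edges: (E, 2) integer pairs [receiver i, sender j]
    eps:   (E, D) edge features eps_ij
    Returns (N, H * len(phi_list)) pooled messages per node.
    """
    i, j = edges[:, 0], edges[:, 1]
    inputs = np.concatenate([x[i], x[j], eps], axis=1)
    pooled = []
    for phi in phi_list:                 # unshared MLPs, same inputs
        msg = mlp(phi, inputs)           # (E, H) message per edge
        agg = np.zeros((x.shape[0], msg.shape[1]))
        np.add.at(agg, i, msg)           # sum pooling (assumed) onto receivers
        pooled.append(agg)
    return np.concatenate(pooled, axis=1)

def predict(phi_list, psi, x, edges, eps):
    """Output MLP psi on [node features, pooled messages] -> one value per node."""
    agg = gnn_layer(phi_list, x, edges, eps)
    return mlp(psi, np.concatenate([x, agg], axis=1))[:, 0]

# Toy example: 4 galaxies, 2 node features, 3 edge features (all hypothetical).
rng = np.random.default_rng(0)
N, F, D, H = 4, 2, 3, 8
x = rng.normal(size=(N, F))
edges = np.array([[0, 1], [1, 0], [0, 2], [2, 0], [1, 2], [2, 1]])  # node 3 is isolated
eps = rng.normal(size=(len(edges), D))

phi_list = [init_mlp(rng, [2 * F + D, 16, H]) for _ in range(2)]
psi = init_mlp(rng, [F + 2 * H, 16, 1])

pred = predict(phi_list, psi, x, edges, eps)  # one halo-mass-like output per galaxy
```

Because the pooling step is a sum over incoming edges, the output does not depend on the order in which edges are listed, and a galaxy with no neighbours receives an all-zero pooled message, so its prediction depends on its own node features alone.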