HaloFlow II: Robust Galaxy Halo Mass Inference with Domain Adaptation

Nikhil Garuda; ChangHoon Hahn; Connor Bottrell; Khee-Gan Lee

Abstract

Precise halo mass ($M_h$) measurements are crucial for cosmology and galaxy formation. HaloFlow introduced a simulation-based inference (SBI) framework that uses state-of-the-art simulated galaxy images to precisely infer $M_h$. However, for HaloFlow to be applied to observations, it must generalize even when the underlying galaxy formation physics differs from that of the simulations on which it was trained. Without this generalization, HaloFlow produces biased and overconfident $M_h$ posteriors when applied to simulations with different physics. We introduce HaloFlow$^{\rm DA}$, an extension of HaloFlow that integrates domain adaptation (DA) with SBI to mitigate these cross-simulation shifts. Using synthetic galaxy images forward-modeled from the IllustrisTNG, EAGLE, and SIMBA simulations, we test two DA methods: Domain-Adversarial Neural Networks (DANN) and Maximum Mean Discrepancy (MMD). Incorporating DA significantly reduces bias and improves calibration, with MMD achieving the most stable performance, lowering the normalized residual metric, $\beta$, by an average of 31% and up to 57% when trained and tested on different simulations. Overall, HaloFlow$^{\rm DA}$ produces more robust and less biased $M_h$ constraints, at similar precision, than the standard approach using the stellar-to-halo mass relation. HaloFlow$^{\rm DA}$ enables consistent, simulation-trained inference models to generalize across domains, establishing a foundation for robust $M_h$ inference from real HSC-SSP observations.
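To make the MMD idea named in the abstract concrete, the following is a minimal sketch of a squared-MMD estimate between two sets of latent features. It is an illustration under stated assumptions, not the paper's implementation: the Gaussian (RBF) kernel, the median-distance bandwidth heuristic, the biased estimator, and the function names `rbf_kernel` and `mmd2` are all choices made here for the example. The toy features are random draws, not simulation data.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between rows of a and rows of b (assumed kernel choice)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    sq = np.maximum(sq, 0.0)  # guard against small negative values from round-off
    return np.exp(-sq / (2.0 * bandwidth**2))

def mmd2(source, target, bandwidth=None):
    """Biased (V-statistic) estimate of the squared MMD between two feature sets."""
    if bandwidth is None:
        # Median heuristic on the pooled sample (an assumption for this sketch).
        pooled = np.vstack([source, target])
        sq = np.sum(pooled**2, axis=1)[:, None] + np.sum(pooled**2, axis=1)[None, :] \
            - 2.0 * pooled @ pooled.T
        bandwidth = np.sqrt(max(np.median(np.maximum(sq, 0.0)), 1e-12))
    k_ss = rbf_kernel(source, source, bandwidth)
    k_tt = rbf_kernel(target, target, bandwidth)
    k_st = rbf_kernel(source, target, bandwidth)
    return k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean()

# Toy 32-dimensional "latent features" for a source and two target domains.
rng = np.random.default_rng(0)
source_feats = rng.normal(size=(200, 32))
matched_feats = rng.normal(size=(200, 32))         # same distribution as source
shifted_feats = rng.normal(size=(200, 32)) + 2.0   # shifted domain

mmd_matched = mmd2(source_feats, matched_feats)
mmd_shifted = mmd2(source_feats, shifted_feats)
```

In a DA setting, a term like this would be added to the regression loss with a weight (the outline lists a DA loss weight $\lambda$), so that minimizing it pulls the source and target feature distributions together. Here `mmd_shifted` comes out much larger than `mmd_matched`, which is the behavior such a penalty relies on.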

Paper Structure (31 sections, 27 equations, 7 figures, 6 tables)

Introduction
Data
Cosmological Hydrodynamical Simulations
Realistic Synthetic Images
Methods
HaloFlow
Domain Adaptation
Domain Adversarial Neural Networks
Maximum Mean Discrepancy (MMD)
HaloFlow$^{\rm DA}$: Domain-Adapted Posterior Inference
Results
Domain Shift Degrades Posterior Inference
Domain Adaptation Improves Generalization
Discussion
Sensitivity to the DA loss $\lambda$
...and 16 more sections

Figures (7)

Figure 1: HSC mock images of simulated galaxies from the $z \approx 0.1$ snapshots of the Eagle, Simba, TNG100, and TNG50 simulations (left to right). 36 central galaxies with stellar masses $10 < \log(M_*/M_\odot) < 11.6$ are shown. Stellar masses increase from upper to lower rows. Individual panels are 120 kpc (63 arcsec) across. RGB colours derive from the HSC $gri$ optical bands using arcsinh scaling.
Figure 2: Schematic of the DA frameworks used in this work. Images of galaxies are preprocessed into input features, $\mathbf{X}$, which are passed through a feature extractor (green) and a regressor (blue) to predict $M_*$ and $M_\mathrm{h}$ for each simulated galaxy. The dark gray panel (MMD) applies an MMD loss in the latent feature space, while the white panel (DANN) routes the extracted features through a gradient reversal layer to a domain-classifier network (red) that predicts which simulation the galaxy comes from and encourages domain-invariant features. The compressed features, $c \mathbf{X}$, used in HaloFlow$^{\rm DA}$ are taken from the feature extractor (32-dimensional) for the DANN runs, and from the label-predictor output (2-dimensional, $(M_*, M_{\rm h})$) for the MMD runs.
Figure 3: UMAP showing the distribution of source (gray) and target (purple) simulations before DA (left) and after DA (right). Before DA, the source and target distributions are highly discrepant. After DA, the source and target distributions are indistinguishable, suggesting that the learned features are domain-invariant.
Figure 4: Left: Inferred vs. true $M_\mathrm{h}$ using HaloFlow trained in-domain (blue; TNG$\Rightarrow$TNG) and out-of-domain (red; TNG$\Rightarrow$Eagle). Predictions are accurate and well-calibrated in the in-domain case, but show systematic bias under domain shift. For reference, we include $M_\mathrm{h}$ constraints using the standard approach based on the SHMR applied out-of-domain (gray). The standard approach also exhibits significant systematic bias. Right: Coverage plots for the HaloFlow posteriors. The in-domain scenario shows near-ideal calibration (black dashed), while the out-of-domain scenario shows poorer calibration, indicating overconfident and unreliable posteriors.
Figure 5: Inferred vs. true halo masses across all train$\Rightarrow$test simulation pairs. Each panel corresponds to one combination (row: test sim, column: train sim). Scatter points show the median inferred $M_\mathrm{h}$ with 1$\sigma$ error bars from: HaloFlow only (red), HaloFlow+DANN (blue), and HaloFlow+MMD (green). Diagonal panels show in-domain inferences with HaloFlow (gray) for reference. Off-diagonal panels reveal model misspecification for HaloFlow (red) and demonstrate that DANN and MMD (blue and green) improve the robustness of $M_\mathrm{h}$ inference across simulations.
...and 2 more figures
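
The coverage plots described in the Figure 4 caption can be illustrated with a short sketch. It assumes one common construction, the empirical coverage of central credible intervals computed from posterior samples; the caption does not specify which coverage test the paper uses, so the function name `empirical_coverage`, the array shapes, and the toy Gaussian posteriors below are all assumptions made for the example.

```python
import numpy as np

def empirical_coverage(samples, truths, levels):
    """Fraction of objects whose true value lies inside the central credible
    interval of each nominal level.

    samples: (n_objects, n_draws) posterior draws per object
    truths:  (n_objects,) true values
    levels:  iterable of nominal credibility levels in (0, 1)
    """
    # Rank of the truth within its posterior: fraction of draws below the truth.
    ranks = np.mean(samples < truths[:, None], axis=1)
    coverage = []
    for level in levels:
        lo, hi = 0.5 - level / 2.0, 0.5 + level / 2.0
        coverage.append(np.mean((ranks >= lo) & (ranks <= hi)))
    return np.array(coverage)

# Toy setup: each object's truth scatters around a posterior mean with unit variance.
rng = np.random.default_rng(1)
n_objects, n_draws = 2000, 500
post_mean = rng.normal(size=n_objects)
truths = post_mean + rng.normal(size=n_objects)

# Calibrated posteriors use the correct width; overconfident ones are too narrow.
calibrated = post_mean[:, None] + rng.normal(size=(n_objects, n_draws))
overconfident = post_mean[:, None] + 0.3 * rng.normal(size=(n_objects, n_draws))

levels = np.array([0.5, 0.68, 0.95])
cov_calibrated = empirical_coverage(calibrated, truths, levels)
cov_overconfident = empirical_coverage(overconfident, truths, levels)
```

For calibrated posteriors the empirical coverage tracks the nominal level, which corresponds to the ideal diagonal in a coverage plot. Posteriors that are too narrow give coverage well below nominal, the signature of overconfidence that the caption attributes to the out-of-domain case.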
