Accelerated calibration of semi-analytic galaxy formation models

Andrew Robertson, Andrew Benson

TL;DR

The paper tackles the computational bottleneck of calibrating semi-analytic galaxy formation models by introducing a fast SHMR-based likelihood evaluated at a few target halo masses. Implemented with Galacticus and differential-evolution MCMC, it yields a good match to the low-redshift SMF and extends the approach to higher-redshift SHMR and the stellar mass–size relation, while highlighting tensions that motivate greater physical flexibility in the cooling, recycling, and feedback schemes. The findings show that the accelerated calibration is effective for rapid model screening but cannot fully reconcile all datasets simultaneously, particularly SHMR evolution across redshift and the mass–size relation. The authors argue that this framework is complementary to emulator-based inference, offering a practical two-stage workflow: first explore model variants quickly, then apply detailed emulation to perform thorough posterior inference on the most promising models.

Abstract

We present an accelerated calibration framework for semi-analytic galaxy formation models, demonstrated with Galacticus. Rather than fitting directly to properties such as the low-redshift stellar mass function (SMF), which requires evolving thousands of halos per likelihood evaluation, we construct a fast likelihood from the stellar-to-halo mass relation (SHMR; mean and scatter) evaluated at a small set of target halo masses, reducing each evaluation to simulating only tens of galaxies. We sample the posterior over Galacticus parameters with Markov Chain Monte Carlo and show that the resulting calibration reproduces the low-redshift SMF. We then extend the method to additional datasets, using a higher-redshift SHMR and the low-redshift stellar mass--size relation as examples, and assess performance for properties relevant to large-scale structure surveys: stellar masses, sizes, and emission-line strengths. The SMF matches data well at low redshift, but toward higher redshift the model yields too few massive galaxies and too many low-mass galaxies. Size evolution with redshift is approximately correct, but the mass--size relation is too flat, producing massive galaxies that are too small. The H$\alpha$ luminosity function is well reproduced at $z \sim 2$, but by $z \sim 0.4$ the model overproduces highly star-forming, H$\alpha$-bright systems. These discrepancies suggest the model lacks sufficient flexibility (e.g. in gas cooling/recycling or feedback) to reconcile all datasets simultaneously. Our strategy complements emulator-based methods for calibrating semi-analytic models by enabling rapid, low-cost scans of model choices and parameterisations, a capability we envision leveraging to supply calibrated starting points for more detailed follow-up inference.
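To make the idea of an SHMR-based likelihood concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes independent Gaussian errors on the mean and scatter of $\log_{10} M_\star$ in each halo-mass bin, which the text here does not confirm. The function names (`shmr_log_likelihood`, `toy_galaxy_model`), the toy model standing in for Galacticus, and all target values are hypothetical placeholders, not measurements. Only the choice of 9 halos in each of three mass bins follows the Figure 3 caption.

```python
import numpy as np


def shmr_log_likelihood(model_log_mstar, target_mean, target_mean_err,
                        target_scatter, target_scatter_err):
    """Gaussian log-likelihood on the SHMR mean and scatter per halo-mass bin.

    model_log_mstar: one array per bin of log10(M*) for the simulated galaxies.
    The independent-Gaussian form is an assumption made for this sketch.
    """
    log_like = 0.0
    for i, samples in enumerate(model_log_mstar):
        samples = np.asarray(samples, dtype=float)
        model_mean = samples.mean()
        model_scatter = samples.std(ddof=1)  # sample standard deviation
        log_like -= 0.5 * ((model_mean - target_mean[i]) / target_mean_err[i]) ** 2
        log_like -= 0.5 * ((model_scatter - target_scatter[i]) / target_scatter_err[i]) ** 2
    return float(log_like)


def toy_galaxy_model(efficiency, scatter, log_mhalo, n_halos, rng):
    """Hypothetical stand-in for a semi-analytic model run (NOT Galacticus).

    Returns log10(M*) for n_halos halos of mass 10**log_mhalo, drawn as
    lognormal scatter around M* = efficiency * M_halo.
    """
    return np.log10(efficiency) + log_mhalo + scatter * rng.standard_normal(n_halos)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    log_mhalo_bins = [11.5, 12.5, 13.5]   # placeholder target halo masses
    # Placeholder targets for illustration only, not observational values.
    target_mean = [9.8, 10.8, 11.2]
    target_mean_err = [0.1, 0.1, 0.1]
    target_scatter = [0.2, 0.2, 0.2]
    target_scatter_err = [0.05, 0.05, 0.05]

    # One likelihood evaluation needs only 3 bins x 9 halos = 27 galaxies.
    model = [toy_galaxy_model(0.02, 0.2, m, 9, rng) for m in log_mhalo_bins]
    print(shmr_log_likelihood(model, target_mean, target_mean_err,
                              target_scatter, target_scatter_err))
```

A function of this kind would be what an MCMC sampler calls at each step. The cost saving comes from the model run inside it touching tens of halos rather than the thousands needed to build a full stellar mass function.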

Paper Structure

This paper contains 36 sections, 16 equations, 10 figures, 1 table.

Figures (10)

  • Figure 1: A corner plot showing the posterior distribution for the Galacticus parameters that we varied, when fitting only to the SHMR (and its scatter) at $z \approx 0.3$. The panels in the top-right show the target data from 2012ApJ...744..159L, as well as the corresponding model data vector for the maximum-likelihood model from our MCMC.
  • Figure 2: The low-redshift SHMR (left) and stellar mass function (right) from a Galacticus model calibrated only to the low-$z$ SHMR in three mass bins (see Section \ref{sect:results_lowzSHMRcalibration}). The mass bins used for calibration are marked by the grey vertical lines in the left panel. The solid lines show the Galacticus results for our MAP parameters, while the shaded regions mark the 16th--84th percentiles, evaluated by running Galacticus with parameters randomly sampled from the MCMC chains.
  • Figure 3: A corner plot showing the posterior distribution for the Galacticus parameters that we varied, when fitting to the SHMR (and its scatter) at $z \approx 0.3$ and $z \approx 0.9$, as well as to the low-redshift stellar mass--size relation. This posterior is labelled "full calibration"; the posterior distribution from Fig. \ref{fig:SHMRcorner} is also shown for comparison, labelled "low-$z$ SHMR". The rightmost panels are similar to the inset panel in Fig. \ref{fig:SHMRcorner}, for both the low- and high-redshift SHMR. The $M_\star$--$R_\star^{50}$ panel shows the relationship between galaxy stellar mass and galaxy half-light radius for the maximum-likelihood model from our MCMC chains. The median target relations for both early-type and late-type galaxies, and the associated lognormal scatter, are shown by the lines with shaded regions (2003MNRAS.343..978S). The dots mark the locations of 27 Galacticus galaxies, with the colour denoting whether they have $B/T>0.5$ (which we associate with early-type galaxies) or $B/T \leq 0.5$ (which we associate with late-type galaxies). These 27 galaxies come from the same 9 halos per mass bin, over three mass bins, as used for the low-$z$ SHMR.
  • Figure 4: Low-$z$ (left) and high-$z$ (right) SHMRs for the Galacticus model calibrated to the low-$z$ and high-$z$ SHMR and the low-$z$ stellar mass--size relation (see Section \ref{sect:includingAdditionalConstraints}). The mass bins used for the SHMR calibration are marked by the grey vertical lines in each panel. The solid lines show the Galacticus results for our MAP parameters, while the shaded regions mark the 16th--84th percentiles, obtained by running Galacticus with parameters randomly drawn from the MCMC chains. Despite being calibrated to the high-$z$ data, the model does not reproduce a sufficiently steep SHMR at high $z$ to match the 2012ApJ...744..159L measurements.
  • Figure 5: Tensions encountered when fitting multiple datasets. Black points with error bars show the 2012ApJ...744..159L means and scatters in $\log M_\star$ at $z=0.29$ (left) and $z=0.88$ (right) used in the likelihood. Coloured symbols show the best-fitting models when calibrating to different dataset combinations (legend). Points are offset slightly in $M_{200\mathrm{m}}$ for clarity; coloured lines connect the unshifted model points to aid comparison.
  • ...and 5 more figures