Cosmic Cartography II: completing galaxy catalogs for gravitational-wave cosmology

Konstantin Leyde, Tessa Baker, Wolfgang Enzi

TL;DR

This work tackles the bias in gravitational-wave cosmology caused by incomplete galaxy catalogs by introducing a forward-modeling framework that reconstructs the full galaxy field while jointly inferring the galaxy magnitude distribution and dark matter statistics. It models the galaxy rate as a product of an overall rate, a spatial distribution derived from a log-normal dark matter field, and a magnitude distribution modeled with a correlated Gaussian field, incorporating redshift uncertainties and a flexible detection probability. The method is validated on Millennium-simulation-based data, demonstrating accurate recovery of redshift, sky position, and magnitude distributions, as well as robust reconstruction of the line-of-sight priors used in dark siren analyses. The results show that completing the galaxy catalog in this probabilistic sense can yield more informative $H_0$ posteriors and set the stage for joint EM-GW analyses, with practical considerations and limitations discussed for future work.
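
The factorized rate described above can be illustrated with a small numerical sketch. This is not the authors' implementation: the function names, the uncorrelated Gaussian stand-in for the dark matter field, the fixed magnitude histogram, and the Poisson sampling step are all assumptions chosen for illustration, and the log-normal map used here is one common convention rather than the paper's exact equation.

```python
import numpy as np

def lognormal_density_contrast(gaussian_field):
    """Map a zero-mean Gaussian field g to delta = exp(g - var/2) - 1.

    Assumed convention: delta > -1 everywhere and its mean is close to zero.
    """
    var = gaussian_field.var()
    return np.exp(gaussian_field - 0.5 * var) - 1.0

def expected_counts(total_rate, delta, magnitude_pdf):
    """Expected galaxies per (voxel, magnitude bin).

    rate = overall rate x normalized spatial weight x normalized magnitude weight
    """
    spatial = (1.0 + delta) / np.sum(1.0 + delta)   # sums to 1 over voxels
    mags = magnitude_pdf / np.sum(magnitude_pdf)    # sums to 1 over magnitude bins
    return total_rate * spatial[:, None] * mags[None, :]

rng = np.random.default_rng(0)

# Hypothetical stand-in: an uncorrelated Gaussian field on 500 voxels.
# The method summarized above uses a spatially correlated field instead.
g = rng.normal(0.0, 0.8, size=500)
delta = lognormal_density_contrast(g)

# Hypothetical fixed magnitude histogram (4 bins); in the summarized method
# the magnitude distribution is inferred rather than fixed.
mag_pdf = np.array([0.1, 0.2, 0.3, 0.4])

rate = expected_counts(1.0e4, delta, mag_pdf)
counts = rng.poisson(rate)  # assumed Poisson sampling of the true catalog
```

Because both the spatial and the magnitude factors are normalized, the overall rate alone sets the expected total number of galaxies, which keeps the three components separately interpretable.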

Abstract

The dark siren method exploits the complementarity between gravitational-wave binary coalescence signals and galaxy catalogs originating from the same regions of space. However, all galaxy catalogs are incomplete, i.e. they only include a subset of all galaxies, typically being biased towards the bright end of the luminosity distribution. This sub-selection systematically affects the dark siren inference of the Hubble constant $H_0$, so a completeness relation has to be introduced that accounts for the missing objects. In the literature it is standard to assume that the missing galaxies are uniformly distributed across the sky and that the galaxy magnitude distribution is known. In this work we develop a novel method which improves upon these assumptions and reconstructs the underlying true galaxy field, respecting the spatial correlation of galaxies on large scales. In our method the true magnitude distribution of galaxies is inferred alongside the spatial galaxy distribution. Our method results in an improved three-dimensional prior in redshift and sky position for the host galaxy of a GW event, which is expected to make the resulting $H_0$ posterior more robust. Building on our previous work, we make a number of improvements, and validate our method on simulated data based on the Millennium simulation. The inference results can be reproduced through our publicly available code base light.

Paper Structure

This paper contains 36 sections, 27 equations, 20 figures, 2 tables.

Figures (20)

  • Figure 1: Toy model: schematic impact of the galaxy catalog completeness. Left panel: The redshift prior for one fiducial sky pixel, where different colors indicate different levels of completeness. The catalogs transition to the homogeneous distribution at different redshifts: while the catalog of 10 galaxies (orange) reverts to the homogeneous distribution already at $z\sim0.2$, the catalog of 800 galaxies (blue) only transitions at $z\sim 1$. This has important consequences for the $H_0$ posterior, as the right panel illustrates: while the most incomplete catalog (orange) provides an almost uninformative $H_0$ posterior, the more complete catalogs with 200 and 800 galaxies (green and blue, respectively) lead to a more informative $H_0$ measurement.
  • Figure 2: Schematic overview of the building blocks of the forward model. All quantities that appear in boxes with dashed lines are fixed prior to the analysis (e.g. the parameters that govern the correlation structure of the magnitude distribution, cf. App. \ref{app:magnitude_distribution}). Note however that all remaining variables (including the magnitude distribution) are inferred jointly along with the DM power spectrum and density contrast.
  • Figure 3: Schematic overview of the hierarchical inference problem. Starting from the true catalog, one identifies possible galaxies (step Object detection). We indicate the apparent magnitude (cf. Eq. \ref{eq:def apparent magnitude}) of each galaxy with its plotted color (blue), as well as its size. Darker, larger circles represent galaxies that are brighter in apparent magnitude. In a second step (step Inference of galaxy properties) the parameters of each galaxy are inferred, independently for each object. Finally, the inference of the individual galaxy properties is improved by considering a joint prior on the overall distribution of galaxies (step Solve inverse problem). In this step, we start from the magnitude-limited catalog with the noisy galaxy properties and build possible configurations of the true catalog that are compatible with the observations. Let us stress here that uncertainties in the galaxy sky position are neglected throughout this analysis.
  • Figure 4: Comparison for two fiducial power spectra (left), one realization of the respective log-normal field, modeling the DM density contrast (center), and its underlying Gaussian random field (right). The latter two fields are related through Eq. \ref{eq:lognormal_transformation}. The top power spectrum is peaked at large $k$ (small scales), resulting in small-scale structures, and vice versa for the second power spectrum, resulting in large-scale correlations. Both power spectra were generated via Eq. \ref{eq: def phenomenological power spectrum}, with (1) $P_{A,\text{DM}} = 10^7$, $P_{n,1} = 1$, $P_{n,2} = 4$, $P_{k,\text{eq}} = 2 \times 10^{-2}$, and $P_{\xi} = 30$ and (2) $P_{A,\text{DM}} = 10^{12}$, $P_{n,1} = 3$, $P_{n,2} = 5$, $P_{k,\text{eq}} = 2 \times 10^{-3}$, and $P_{\xi} = 10$ for the power spectrum parameters. While the right-hand side is qualitatively different from full N-body simulations, the sampling from log-normal fields is computationally significantly less costly while ensuring large-scale correlations for the DM density.
  • Figure 5: The apparent magnitude-redshift histogram for the (observed) simulated catalog, showing (left) a representative HEALPixel and (right) the number counts summed over the entire sky. To guide the eye, we have added in cyan the apparent magnitude of a galaxy with fixed absolute magnitude (cf. Eq. \ref{eq:def apparent magnitude}), assuming the cosmology that was used to generate the catalog. Bright galaxies transition from low $m$ (at nearby redshifts) to larger $m$ at higher redshift, where the exact evolution depends on the cosmology through the luminosity distance-redshift relation. The onset of incompleteness is clearly visible: at values of $m\gtrapprox 20$ the number counts plummet, a direct consequence of the detection probability that we simulate in Eq. \ref{eq:pdet modeling sigmoid}. We stress that the galaxy number count is shown here (not the galaxy density), resulting in a volumetric effect due to the increased volume of each pixel with increasing $z$. While the representative HEALPixel (left) is consistent with the overall structure of the distribution averaged over the sky (right), the Poisson noise is larger. Indeed, the smoothness of the right panel is a result of recovering spatial homogeneity on large scales, which is not recovered at the individual HEALPixel level, as expected.
  • ...and 15 more figures
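
The guide curve and the incompleteness cut described in the Figure 5 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: it uses the standard distance modulus in a flat ΛCDM cosmology without K-corrections, and the function names, the cosmological parameter values, and the sigmoid parameters `m_threshold` and `width` are hypothetical. Only the location of the cut near $m \approx 20$ is taken from the caption.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s

def luminosity_distance_mpc(z, h0=70.0, omega_m=0.3, n_steps=2048):
    """Luminosity distance in Mpc for flat LambdaCDM (assumed cosmology)."""
    zs = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zs) ** 3 + (1.0 - omega_m))
    # Trapezoidal rule for the comoving-distance integral of 1/E(z)
    integral = np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zs))
    return (1.0 + z) * C_KM_S / h0 * integral

def apparent_magnitude(abs_mag, z, **cosmo):
    """Distance modulus m = M + 5 log10(d_L / 10 pc), ignoring K-corrections."""
    d_pc = luminosity_distance_mpc(z, **cosmo) * 1.0e6
    return abs_mag + 5.0 * np.log10(d_pc / 10.0)

def detection_probability(m, m_threshold=20.0, width=0.3):
    """Hypothetical sigmoid: close to 1 for bright objects, close to 0 for faint ones."""
    return 1.0 / (1.0 + np.exp((m - m_threshold) / width))

# A galaxy of fixed absolute magnitude grows fainter with redshift,
# so its detection probability drops once m approaches the threshold.
for z in (0.05, 0.2, 0.5, 1.0):
    m = apparent_magnitude(-21.0, z)
    print(f"z={z:4.2f}  m={m:5.2f}  p_det={detection_probability(m):.3f}")
```

Since the luminosity distance scales as $1/H_0$, changing the assumed $H_0$ shifts the guide curve in $m$ at fixed redshift, which is the cosmology dependence the caption refers to.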