The Halo Occupation Distribution: Towards an Empirical Determination of the Relation Between Galaxies and Mass
Andreas A. Berlind, David H. Weinberg
TL;DR
The paper develops the Halo Occupation Distribution (HOD) framework as a complete description of galaxy bias, encapsulating the probability $P(N|M)$ that a halo of mass $M$ hosts $N$ galaxies and the intra-halo spatial/velocity relations. Through N-body simulations and varied HOD prescriptions (power-law and broken power-law $N_{\text{avg}}(M)$, different $P(N|N_{\text{avg}})$, central galaxies, and halo biases), it analyzes how a suite of clustering statistics responds to different aspects of the HOD: primarily the two-point function $\xi_g(r)$, but also the galaxy–mass cross-correlation $\xi_{gm}(r)$, the bispectrum $Q$, the void probability function, the pairwise velocity dispersion, and redshift-space distortions. The authors demonstrate that achieving the observed power-law form of $\xi_g(r)$ requires finely balanced HOD parameters, and that many statistics provide complementary constraints that can, in principle, empirically determine the full HOD from redshift surveys. They outline a practical strategy for measuring the HOD from data (e.g., 2dF/SDSS) and discuss how combining clustering with lensing and dynamical mass estimates can break degeneracies with cosmology, enabling tests of galaxy formation physics and sharpening cosmological inferences.
Abstract
We investigate galaxy bias in the framework of the "Halo Occupation Distribution" (HOD), which defines the bias of a population of galaxies by the conditional probability $P(N|M)$ that a dark matter halo of virial mass $M$ contains $N$ galaxies, together with prescriptions that specify the relative spatial and velocity distributions of galaxies and dark matter within halos. By populating the halos of a cosmological N-body simulation using a variety of HOD models, we examine the sensitivity of different galaxy clustering statistics to properties of the HOD. The galaxy correlation function responds to different aspects of $P(N|M)$ on different scales. Obtaining the observed power-law form of $\xi(r)$ requires rather specific combinations of HOD parameters, implying a strong constraint on the physics of galaxy formation; the success of numerical and semi-analytic models in reproducing this form is entirely non-trivial. Other clustering statistics such as the galaxy-mass correlation function, the bispectrum, the void probability function, the pairwise velocity dispersion, and the group multiplicity function are sensitive to different combinations of HOD parameters and thus provide complementary information about galaxy bias. We outline a strategy for determining the HOD empirically from redshift survey data. This method starts from an assumed cosmological model, but we argue that cosmological and HOD parameters will have non-degenerate effects on galaxy clustering, so that a substantially incorrect cosmological model will not reproduce the observations for any choice of HOD. Empirical determinations of the HOD as a function of galaxy type from the 2dF and SDSS redshift surveys will provide a detailed target for theories of galaxy formation, insight into the origin of galaxy properties, and sharper tests of cosmological models.
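
To make the idea of populating halos from $P(N|M)$ concrete, here is a minimal Python sketch under stated assumptions; it is not the authors' code. It assumes a power-law mean occupation $N_{\text{avg}}(M) = (M/M_1)^\alpha$ with a hard cutoff below a minimum mass, and draws $N$ either from a Poisson distribution or from the two integers bracketing $N_{\text{avg}}$. The parameter names `m_min`, `m_1`, and `alpha`, the function names, the cutoff, and all numerical values are illustrative and are not taken from the text above.

```python
import math
import random


def mean_occupation(mass, m_min, m_1, alpha):
    """Assumed power-law mean occupation: (M / M_1)**alpha above m_min, else 0."""
    if mass < m_min:
        return 0.0
    return (mass / m_1) ** alpha


def draw_nearest_integer(n_avg, rng):
    """Draw N as one of the two integers bracketing n_avg, with mean n_avg."""
    low = math.floor(n_avg)
    return low + (1 if rng.random() < n_avg - low else 0)


def draw_poisson(n_avg, rng):
    """Draw N from a Poisson distribution (Knuth's method, fine for modest n_avg)."""
    threshold = math.exp(-n_avg)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def populate(halo_masses, m_min, m_1, alpha, draw, seed=0):
    """Return one galaxy count per halo for the chosen P(N | N_avg) sampler."""
    rng = random.Random(seed)
    return [draw(mean_occupation(m, m_min, m_1, alpha), rng) for m in halo_masses]


if __name__ == "__main__":
    # Hypothetical halo masses in arbitrary mass units.
    masses = [5e11, 2e12, 1e13, 8e13, 3e14]
    for name, draw in [("nearest-integer", draw_nearest_integer),
                       ("poisson", draw_poisson)]:
        counts = populate(masses, m_min=1e12, m_1=1e13, alpha=0.8, draw=draw)
        print(name, counts)
```

Both samplers share the same mean occupation and differ only in the scatter of $N$ about it, which is the kind of variation in $P(N|N_{\text{avg}})$ whose effect on clustering statistics the paper examines.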
