Estimating cluster masses: a comparative study between machine learning and maximum likelihood

Raeed Mundow, Adi Nusser

TL;DR

This paper addresses the estimation of cluster virial masses $M_v$ from the surrounding galaxy distribution, without identifying cluster members. It compares a physics-informed maximum-likelihood estimator (MLE), which relies on universal, scaled CAH profiles, with a data-driven autoencoder convolutional neural network (AE-CNN) trained on MDPL2 mock catalogs. With redshift-space number density as input, the AE-CNN achieves a lower rms scatter in $\log M_v$ than the MLE ($0.10$ dex vs $0.16$ dex). With mean peculiar velocities as input, binned in redshift space or in observed distance, it reaches $0.12$ dex and $0.16$ dex, respectively, despite strong inhomogeneous Malmquist bias. The results illustrate a trade-off between interpretability and flexibility: the AE-CNN learns an approximation of the posterior mean from data, while the MLE remains transparent under its universal-profile assumption.

Abstract

We compare an autoencoder convolutional neural network (AE-CNN) with a conventional maximum-likelihood estimator (MLE) for inferring cluster virial masses, $M_v$, directly from the galaxy distribution around clusters, without identifying members or interlopers. The AE-CNN is trained on mock galaxy catalogues, whereas the MLE assumes that clusters of similar mass share the same phase-space galaxy profile. Conceptually, the MLE returns an unbiased estimate of $\log M_v$ at fixed true mass, whereas the AE-CNN approximates the posterior mean, so the true $\log M_v$ is unbiased at fixed estimate. Using MDPL2 mock clusters with redshift-space number density as input, the AE-CNN attains an rms scatter of $0.10\,\textrm{dex}$ between predicted and true $\log M_v$, compared with $0.16\,\textrm{dex}$ for the MLE. With inputs based on mean peculiar velocities, binned in redshift space or observed distance, the AE-CNN achieves scatters of $0.12\,\textrm{dex}$ and $0.16\,\textrm{dex}$, respectively, despite strong inhomogeneous Malmquist bias.
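
The distinction between the two kinds of unbiasedness can be illustrated with a toy Gaussian model. The sketch below is not the paper's pipeline: it assumes a Gaussian prior on $\log M_v$ and a Gaussian-noise estimator standing in for the MLE, and all names and numbers (`sigma_prior`, `sigma_noise`, `mu`) are arbitrary choices made for illustration. In this setting the posterior mean is a shrunken version of the noisy estimate, and it has a smaller rms scatter about the truth.

```python
import numpy as np

def toy_estimators(n=200_000, mu=14.0, sigma_prior=0.3, sigma_noise=0.2, seed=0):
    """Draw toy log-masses and two estimates of them (all values are assumptions)."""
    rng = np.random.default_rng(seed)
    x_true = rng.normal(mu, sigma_prior, n)            # true log M_v, Gaussian prior
    x_mle = x_true + rng.normal(0.0, sigma_noise, n)   # unbiased at fixed truth
    # Posterior mean for a Gaussian prior and Gaussian noise: shrink toward mu.
    w = sigma_prior**2 / (sigma_prior**2 + sigma_noise**2)
    x_pm = mu + w * (x_mle - mu)
    return x_true, x_mle, x_pm

def slope(x, y):
    """Least-squares slope of y regressed on x."""
    return np.polyfit(x, y, 1)[0]

def rms(a, b):
    """Root-mean-square difference between two arrays."""
    return np.sqrt(np.mean((a - b) ** 2))

x_true, x_mle, x_pm = toy_estimators()

print("rms scatter, noisy estimate :", rms(x_mle, x_true))
print("rms scatter, posterior mean :", rms(x_pm, x_true))
print("slope of estimate vs truth (noisy estimate):", slope(x_true, x_mle))
print("slope of truth vs estimate (noisy estimate):", slope(x_mle, x_true))
print("slope of truth vs estimate (posterior mean):", slope(x_pm, x_true))
```

In this toy setup, regressing the noisy estimate on the truth gives a slope near one (unbiased at fixed true mass), while regressing the truth on the posterior mean gives a slope near one (truth unbiased at fixed estimate). Regressing the truth on the noisy estimate gives a slope below one, which is the shrinkage the posterior mean corrects for.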

Paper Structure

This paper contains 11 sections, 22 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: Top panel: Normalized number density profiles of CAHs of virial masses $m_\mathrm{v}>10^{10}\, h^{-1} \, {\rm M}_\odot$, around clusters, for three cluster mass bins, as indicated in the figure ($M_\mathrm{v14}\equiv M_\mathrm{v}/10^{14}\, h^{-1} \, {\rm M}_\odot$). Solid curves represent stacked profiles, while dashed lines are examples of individual clusters, demonstrating deviations from the universal pattern. Bottom panel: Profiles separated by mass of CAHs $m_\mathrm{v}$ (in units of $10^{11}\, h^{-1} \, {\rm M}_\odot$), for all clusters. The differences do not affect the MLE, since the same $m_\mathrm{v}$ cut is adopted.
  • Figure 2: Top panels: Moments of the radial (outward from the cluster halo center) peculiar velocity of CAHs with $m_\mathrm{v}>10^{10}\, h^{-1} \, {\rm M}_\odot$, normalized by $H_0r_\mathrm{v}$. The left panel shows stacked profiles of the mean radial velocity for three cluster mass bins. The right panel shows the standard deviation of the radial velocity (the radial velocity dispersion). Bottom panels: Same quantities as the top panels, but showing the dependence of the profiles on the mass of CAHs in all clusters used. For reference, by Eq. (\ref{eq:virdef}), the circular velocity is $V_\mathrm{c}\approx 0.14 r_\mathrm{v}$ at $z=0$.
  • Figure 3: Illustration of inhomogeneous Malmquist bias. Mean line-of-sight velocity profiles of CAHs for a simulated cluster with $M_{\mathrm{v}} = 5\times 10^{14}\,h^{-1}\,\textrm{M}_\odot$. The line of sight passes through the cluster center, with the observer located to the left. The red curve corresponds to the true velocity binned by true distance, $d^\text{true}$; the orange curve represents the observed velocity, $v^\text{obs}$, binned by observed distance, $d^\text{obs}$; and the blue curve is the mean $v^\text{obs}$ as a function of the redshift-space coordinate $s$. The cluster center is positioned at $d^\text{true} = 50\,h^{-1}\,\mathrm{Mpc}$, and a relative distance error of $20\%$ is assumed in computing $d^\text{obs}$ and $v^\text{obs}$.
  • Figure 4: Illustration of the autoencoder architecture. The numbers inside the blue rectangles are the number of filters in each layer, while the numbers outside give the grid dimensions as the data pass through the network.
  • Figure 5: Training and validation mean squared error (MSE) versus training epoch, for the three neural network models. Solid, thicker lines denote the training MSE, while thinner lines indicate the validation MSE. Vertical dashed lines mark the epochs at which the best-performing models are saved.
  • ...and 2 more figures
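
The inhomogeneous Malmquist bias described in the Figure 3 caption can be reproduced qualitatively with a one-dimensional toy model. The sketch below is an illustration under stated assumptions, not the paper's procedure: the cluster distance ($50\,h^{-1}\,\mathrm{Mpc}$) and the $20\%$ relative distance error are taken from the caption, while the lognormal error model, the zero true peculiar velocities, the Gaussian cluster width, the uniform field population, and the object counts are all hypothetical choices. The observed velocity is taken as $v^\text{obs} = H_0\,(s - d^\text{obs})$, so with zero true velocities any non-zero mean signal is pure bias.

```python
import numpy as np

H0 = 100.0  # km/s per h^-1 Mpc, so distances are in h^-1 Mpc

def mean_vobs_profile(n_cluster=20_000, n_field=10_000, d_cluster=50.0,
                      rel_err=0.2, seed=1):
    """Mean observed velocity in bins of observed distance (toy 1D model)."""
    rng = np.random.default_rng(seed)
    # True distances: a narrow cluster overdensity plus a uniform field (assumed).
    d_true = np.concatenate([
        rng.normal(d_cluster, 2.0, n_cluster),
        rng.uniform(10.0, 100.0, n_field),
    ])
    v_true = np.zeros_like(d_true)   # assumption: no true peculiar velocities
    s = d_true + v_true / H0         # redshift-space coordinate
    # Lognormal distance errors with the quoted relative scatter (assumed form).
    d_obs = d_true * np.exp(rel_err * rng.standard_normal(d_true.size))
    v_obs = H0 * (s - d_obs)         # inferred line-of-sight velocity
    edges = np.arange(20.0, 85.0, 10.0)          # bin edges 20, 30, ..., 80
    centres = 0.5 * (edges[:-1] + edges[1:])
    idx = np.digitize(d_obs, edges) - 1          # bin index per object
    mean_v = np.array([v_obs[idx == i].mean() for i in range(centres.size)])
    return centres, mean_v

centres, mean_v = mean_vobs_profile()
for c, v in zip(centres, mean_v):
    print(f"d_obs ~ {c:5.1f} h^-1 Mpc : <v_obs> = {v:8.1f} km/s")
```

In this toy model, objects scattered out of the overdensity to smaller observed distances acquire positive inferred velocities and those scattered to larger observed distances acquire negative ones, which mimics infall toward the cluster even though the true velocities are zero.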