
Forecasting Dark Matter Subhalo Constraints from Stellar Streams using Implicit Likelihood Inference

Tri Nguyen, Rutong Pei, Zhuofu Li, Nora Shipp, Scott Dodelson, Denis Erkal, Peter S. Ferguson, Tjitske K. Starkenburg, Markus M. Rau, Alexander H. Riley, Alan Junzhe Zhou, the LSST Dark Energy Science Collaboration

Abstract

The evidence for dark matter (DM) remains compelling, although attempts to understand its particle nature remain inconclusive. One promising method to study DM is detecting DM subhalos through their gravitational interactions with stellar streams. In this study, we apply Neural Posterior Estimation (NPE) to constrain subhalo interaction parameters, including mass, scale radius, velocity, and encounter geometry, from stellar stream kinematics. We generate particle spray simulations based on the Lagrange Cloud stripping technique, focusing on the ATLAS-Aliqa Uma stream as a test case. We train multiple NPE models across multiple observational scenarios, quantifying how kinematic completeness affects inference and forecasting constraints from upcoming surveys including LSST, 4MOST, and 10-year Gaia data. Our results demonstrate that NPE can produce accurate and well-calibrated posteriors. In the idealized case with full 6D coordinates, we achieve subhalo mass uncertainties of 15-20% for a $10^7 \, \mathrm{M_\odot}$ subhalo, with 5D coordinates (excluding radial velocities) achieving similar performance. Under realistic observational conditions, mass uncertainties range from 50% (present-day) to 20-40% (future scenarios), with comparable performance between the photometric-only LSST sample and a smaller sample that includes Gaia proper motions and 4MOST radial velocities. Most notably, we find that velocity bimodality emerges when phase space is poorly sampled, whether due to missing kinematic information or limited stellar tracers. Combining large photometric samples with targeted spectroscopic follow-up can effectively resolve this degeneracy. These results demonstrate the power of implicit likelihood inference for optimizing stellar stream observational strategies and forecasting DM subhalo constraints from upcoming surveys.

Paper Structure

This paper contains 23 sections, 8 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Geometry of the subhalo-stellar stream encounter. The impact time $T_\mathrm{a}$ is not illustrated. Adapted from Figure 1 of hilmi24.
  • Figure 2: Example perturbed streams. Blue data points denote the raw particle data output by the simulation, while the black data points with error bars denote the preprocessed data. The top left panel shows the mock AAU stream, while the rest of the panels show example streams from the training dataset. In each panel, each row shows a different observable as a function of the stream longitude $\phi_1$: from top to bottom, the stream latitudes $\phi_2$, radial velocities $v_r$, proper motions $\mu_1$ and $\mu_2$, and distances $d$. For clarity, for the velocities and distances, the differences relative to an unperturbed stream are shown, where the unperturbed stream baseline is estimated by fitting the coordinates and velocities using a fourth-order polynomial following hilmi24.
  • Figure 3: The flowchart of the NPE framework. The training data $\vec{x}$ is created by sampling the subhalo interaction parameters $\vec{\theta}$ from a prior distribution and passing them into the simulator, which implicitly encodes the likelihood function $p(\vec{x} \mid \vec{\theta})$. Data preprocessing steps are assumed to be incorporated within the simulator for simplicity. During training, the NPE model $\hat{q}_\phi(\vec{\theta} \mid \vec{x})$, which consists of a Transformer encoder and a flow density estimator, minimizes the objective $\mathcal{L}_\mathrm{NLL}$ (Equation \ref{eq:loss_nll}). Once trained, during inference, given an observed stream $\vec{x}_0$, the posterior $\hat{q}_\phi(\vec{\theta} \mid \vec{x}_0)$ can be directly sampled without re-training.
  • Figure 4: Comparison between the MAP estimators and the true parameters for the four observable sets. Each panel shows the MAP estimators versus the true parameters for different subhalo parameters. In each panel, streams are grouped into bins based on their true parameters. Data points and error bars show the mean and 68% confidence interval of the MAP estimators within each bin, with different markers and colors corresponding to different observable sets. The black dashed lines show the one-to-one correspondence indicating perfect recovery.
  • Figure 5: Credibility level versus expected coverage from TARP for the four observable sets. Each solid line and shaded band show the median and 68% confidence interval of the TARP result across 100 bootstrap samples of a different observable set (color). Bands are too narrow to be visible at this scale. The black dashed diagonal line represents the ideal case where posteriors are perfectly calibrated. The black S-shaped curves indicate under-confident and over-confident posteriors, respectively.
  • ...and 5 more figures
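
The Figure 2 caption describes showing velocities and distances as differences relative to an unperturbed baseline, estimated with a fourth-order polynomial fit along $\phi_1$. A minimal sketch of that kind of baseline subtraction is given below. The function name `subtract_polynomial_baseline`, the unweighted least-squares fit, and the synthetic track are illustrative assumptions; the exact procedure of hilmi24 (for example, any weighting or masking of the perturbed region) is not specified here.

```python
import numpy as np
from numpy.polynomial import Polynomial


def subtract_polynomial_baseline(phi1, values, degree=4):
    """Fit a polynomial in phi1 and return (residuals, baseline).

    Hypothetical helper: an unweighted least-squares fit is assumed.
    """
    phi1 = np.asarray(phi1, dtype=float)
    values = np.asarray(values, dtype=float)
    # Polynomial.fit rescales phi1 internally, which keeps the fit well conditioned.
    baseline_fit = Polynomial.fit(phi1, values, deg=degree)
    baseline = baseline_fit(phi1)
    return values - baseline, baseline


# Synthetic illustration only: a smooth track plus a localized bump and noise.
rng = np.random.default_rng(0)
phi1 = np.linspace(-10.0, 10.0, 200)            # stream longitude (deg)
smooth_track = 0.5 + 0.1 * phi1 - 0.02 * phi1**2
bump = 0.3 * np.exp(-0.5 * (phi1 - 2.0) ** 2 / 0.5**2)
v_r = smooth_track + bump + rng.normal(0.0, 0.01, phi1.size)

residuals, baseline = subtract_polynomial_baseline(phi1, v_r)
```

In this toy case the smooth track is absorbed by the polynomial, so the residuals are dominated by the localized bump. A low-order fit is a natural choice for this purpose because it cannot follow features much narrower than the stream length, although it does absorb a small part of any perturbation it is fitted through.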