Hierarchical Neural Simulation-Based Inference Over Event Ensembles

Lukas Heinrich, Siddharth Mishra-Sharma, Chris Pollard, Philipp Windischhofer

TL;DR

This work addresses inference for datasets composed of event ensembles governed by hierarchical forward models with global parameters $\theta$ and local parameters $\{z_i\}$. It introduces dataset-wide, hierarchy-aware neural simulation-based inference methods that estimate either posterior distributions or likelihood ratios directly from simulations, using deep-set and transformer architectures to handle varying dataset cardinality and nuisance parameters. The methods are demonstrated on a toy multivariate normal model, particle-physics mixture models (in frequentist and Bayesian settings), and an astrophysical strong-lensing example, showing tighter parameter constraints and substantial speedups over traditional techniques while preserving calibration. Overall, the approach enables scalable, amortized inference for complex hierarchical data in physics, astronomy, and related fields, with real-time updating capabilities and robust performance across diverse problem settings.

Abstract

When analyzing real-world data it is common to work with event ensembles, which comprise sets of observations that collectively constrain the parameters of an underlying model of interest. Such models often have a hierarchical structure, where "local" parameters impact individual events and "global" parameters influence the entire dataset. We introduce practical approaches for frequentist and Bayesian dataset-wide probabilistic inference in cases where the likelihood is intractable, but simulations can be realized via a hierarchical forward model. We construct neural estimators for the likelihood(-ratio) or posterior and show that explicitly accounting for the model's hierarchical structure can lead to significantly tighter parameter constraints. We ground our discussion using case studies from the physical sciences, focusing on examples from particle physics and cosmology.
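
As a rough illustration of the setting described above, the following sketch simulates an event ensemble from a toy hierarchical forward model and compresses it with a deep-set-style summary (a per-event embedding, mean pooling, then a set-level map). It is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian forward model, the function names `simulate_ensemble` and `deep_set_summary`, the layer sizes, and the random untrained weights are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_ensemble(theta, n_events, rng):
    """Toy hierarchical forward model (assumed for illustration only).

    theta    : global parameter shared by the whole dataset
    z_i      : local parameter drawn once per event, z_i ~ N(theta, 1)
    x_i      : observed event, x_i ~ N(z_i, 0.5)
    """
    z = rng.normal(loc=theta, scale=1.0, size=n_events)  # local parameters
    x = rng.normal(loc=z, scale=0.5)                     # one observation per event
    return x


def deep_set_summary(x, w_phi, w_rho):
    """Deep-set-style summary: embed each event, mean-pool, map the pooled vector.

    x     : array of shape (n_events,)
    w_phi : per-event embedding weights, shape (hidden,)
    w_rho : set-level weights, shape (hidden, out)
    """
    h = np.tanh(x[:, None] * w_phi[None, :])  # (n_events, hidden) per-event embedding
    pooled = h.mean(axis=0)                   # (hidden,) permutation-invariant pooling
    return np.tanh(pooled @ w_rho)            # (out,) fixed-size dataset summary


rng = np.random.default_rng(0)

# Placeholder weights: random and untrained, only to exercise the shapes.
w_phi = rng.normal(size=8)
w_rho = rng.normal(size=(8, 2))

# Two ensembles of different cardinality generated with the same global parameter.
x_small = simulate_ensemble(theta=1.0, n_events=10, rng=rng)
x_large = simulate_ensemble(theta=1.0, n_events=250, rng=rng)

s_small = deep_set_summary(x_small, w_phi, w_rho)
s_large = deep_set_summary(x_large, w_phi, w_rho)

# Shuffling the events leaves the summary unchanged (up to floating-point error).
s_shuffled = deep_set_summary(rng.permutation(x_large), w_phi, w_rho)
```

Because the pooling step averages over events, the summary has the same size for any number of events and does not depend on their order. In a full pipeline, a summary of this kind would be passed to a trained posterior or likelihood-ratio estimator; that step is omitted here.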

Paper Structure (35 sections, 19 equations, 7 figures)


Figures (7)

  • Figure 1: Schematic illustration of the deep set-based architecture used in this work. The red lines/arrows show the path used only in the frequentist setting for training a global test-statistic estimator while profiling over global nuisance parameters.
  • Figure 2: The median and middle-68% containment of the inferred posterior width over 500 test samples as a function of the size of the dataset. Results for the deep sets (left) and transformer (middle) architectures are shown. The expected scaling (dashed lines) is observed in all cases, but the deep set architecture shows a narrower spread in outcomes. The right plot shows the evolution of a posterior for a specific sequence using the deep set model, illustrating convergence of the posterior mass around the true point.
  • Figure 3: (Left) True profile likelihood ratio for the mixture model scenario in Sec. \ref{sec:profiling} (bottom) compared to the neural test statistic (top). (Center) Scatter plot of the neural test statistic against the profile likelihood ratio, showing a bijective relationship between the two. (Right) Maximum-likelihood estimate obtained from the neural test statistic $\hat{\theta}_\mathrm{ML}$, compared with the true value $\hat{\theta}_\mathrm{pLR}$.
  • Figure 4: (Left) Signal and background densities $p_s$ and $p_b$ in a simplified particle physics analysis for two different true values of the nuisance parameter $\theta_\nu$, together with sample datasets $\{x\}$ of size $N_0=100$. (Center, right) The median mean $\mu_{\theta}$ (center) and median standard deviation $\sigma_{\theta}$ (right) of the posterior $p(\theta\mid\{x\})$, obtained with different inference methods for different true nuisance parameter values $\theta_{\nu, \mathrm{true}}$. The median is computed on ensembles of 400 datasets for each parameter point.
  • Figure 5: Illustrative samples from the lensing model. The rows show two different choices of local (per-event) parameters, while the columns show variations of the global (set-wide) parameters for a fixed choice of local parameters. Sample-to-sample variation induced by the global parameters, which control the abundance of a subhalo population in the lens, can be seen. The scatter points show the locations of individual subhalos in each image.
  • ...and 2 more figures