Variational Pseudo Marginal Methods for Jet Reconstruction in Particle Physics
Hanming Yang, Antonio Khalil Moretti, Sebastian Macaluso, Philippe Chlenski, Christian A. Naesseth, Itsik Pe'er
TL;DR
This paper tackles jet reconstruction by treating jet histories as latent binary trees and addressing the intractable combinatorial space of possible topologies. It introduces a combinatorial sequential Monte Carlo (CSMC) framework tailored to jets and builds two variational approaches (VCSMC and VNCSMC) that learn both local tree structures and a global decay parameter $\lambda$, then unifies inference within a variational pseudo-marginal framework. The methodology yields unbiased marginal-likelihood estimators and scalable variational objectives, and it achieves better speed and accuracy than baseline clustering methods on data generated by the Ginkgo jet generator. The result is a principled, fully Bayesian, uncertainty-aware treatment of hierarchical jet reconstruction, with potential applications to collider data analyses and to probabilistic modeling of combinatorial structures in physics.
Abstract
Reconstructing jets, which provide vital insights into the properties and histories of subatomic particles produced in high-energy collisions, is a central problem in collider physics data analysis. The task requires estimating the latent structure of a jet (a binary tree) and involves parameters such as particle energy, momentum, and type. While Bayesian methods offer a natural approach for handling uncertainty and leveraging prior knowledge, they face significant challenges due to the super-exponential growth in the number of possible jet topologies as the number of observed particles increases. To address this, we introduce a Combinatorial Sequential Monte Carlo approach for inferring jet latent structures. As a second contribution, we leverage the resulting estimator to develop a variational inference algorithm for parameter learning. Building on this, we introduce a variational family using a pseudo-marginal framework for a fully Bayesian treatment of all variables, unifying the generative model with the inference process. We illustrate our method's effectiveness through experiments using data generated with a collider physics generative model, highlighting superior speed and accuracy across a range of tasks.
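
To give a sense of the super-exponential growth mentioned above, the following minimal sketch counts tree topologies under the assumption that a jet history is a rooted binary tree over N distinguishable leaf particles, for which the standard count is the double factorial (2N-3)!!. The function name `num_binary_tree_topologies` is illustrative and does not come from the paper or its code.

```python
def num_binary_tree_topologies(n_leaves: int) -> int:
    """Count rooted binary trees with n_leaves labeled leaves.

    Assumption: jet topologies are counted as rooted binary trees over
    distinguishable leaves, giving (2n - 3)!! = 1 * 3 * 5 * ... * (2n - 3).
    """
    if n_leaves < 1:
        raise ValueError("n_leaves must be a positive integer")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3 (empty product for n <= 2).
    for odd in range(3, 2 * n_leaves - 2, 2):
        count *= odd
    return count


if __name__ == "__main__":
    # Print the count for a few leaf multiplicities to show the growth rate.
    for n in (2, 3, 4, 5, 10, 20):
        print(f"N = {n:2d} leaves -> {num_binary_tree_topologies(n)} topologies")
```

Under this counting, 4 leaves admit 15 topologies and 10 leaves already admit more than 34 million, which is why exhaustive enumeration becomes impractical and sampling-based inference is attractive.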
