Energy Flow Networks: Deep Sets for Particle Jets
Patrick T. Komiske, Eric M. Metodiev, Jesse Thaler
TL;DR
This paper addresses learning from collider events treated as variable-length, unordered sets of particles by introducing Energy Flow Networks (EFNs) and Particle Flow Networks (PFNs), both based on the Deep Sets paradigm. A PFN writes a general permutation-symmetric observable as $\mathcal{O}(\{p_i\}) = F\Big(\sum_i \Phi(p_i)\Big)$, where $\Phi$ maps each particle to a latent vector and $F$ acts on the summed latent representation. An EFN restricts this to infrared- and collinear-safe (IRC-safe) observables via $\mathcal{O}(\{p_i\}) = F\Big(\sum_i z_i\,\Phi(\hat{p}_i)\Big)$, where $z_i$ is the energy weight of particle $i$ and $\hat{p}_i$ its angular information. This decomposition unifies detector images and radiation moments within a single framework. The authors demonstrate quark/gluon jet discrimination with EFNs and PFNs that matches or improves on existing methods, present latent-space visualizations that reflect QCD’s collinear structure, and extract closed-form observables (e.g., $A_{r_0}$, $B_{r_1,\beta}$, and $C(A,B)$) from trained models, connecting learned representations to analytic physics. The architectures are permutation-invariant by construction and apply to a broad range of LHC analyses, with possible extensions to pileup mitigation and event-level learning.
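The energy-weighted form $F\big(\sum_i z_i\,\Phi(\hat{p}_i)\big)$ can be illustrated with a minimal NumPy sketch. This is not the authors' EnergyFlow implementation: the single-layer `phi`, the scalar readout in `efn`, the latent size of 8, and the random weights are all assumptions chosen for illustration. The sketch checks three properties that follow from the formula: invariance under particle reordering, under adding a zero-energy particle, and under splitting one particle into two collinear ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, randomly initialised weights; a trained model would learn these.
W_phi = rng.normal(size=(2, 8))  # (rapidity, azimuth) -> 8 latent dimensions
b_phi = rng.normal(size=8)
W_f = rng.normal(size=(8, 1))    # latent vector -> scalar output

def phi(p_hat):
    """Per-particle map Phi; it sees angular coordinates only."""
    return np.tanh(p_hat @ W_phi + b_phi)

def efn(z, p_hat):
    """O = F(sum_i z_i * Phi(p_hat_i)) for a single jet."""
    latent = (z[:, None] * phi(p_hat)).sum(axis=0)  # energy-weighted sum
    return np.tanh(latent @ W_f).item()             # F: one illustrative layer

# A toy jet: three particles with energy fractions z and angles p_hat.
z = np.array([0.5, 0.3, 0.2])
p_hat = rng.uniform(-0.4, 0.4, size=(3, 2))
out = efn(z, p_hat)

# Permutation invariance: reordering the particles leaves the sum unchanged.
perm = [2, 0, 1]
out_perm = efn(z[perm], p_hat[perm])

# Infrared safety: a zero-energy particle contributes nothing to the sum.
out_soft = efn(np.append(z, 0.0), np.vstack([p_hat, [0.1, -0.2]]))

# Collinear safety: splitting particle 0 into two particles at the same angle
# with energies 0.2 + 0.3 = 0.5 gives the same weighted sum.
z_split = np.concatenate([[0.2, 0.3], z[1:]])
p_split = np.vstack([p_hat[0], p_hat[0], p_hat[1:]])
out_split = efn(z_split, p_split)

print(np.isclose(out, out_perm), np.isclose(out, out_soft), np.isclose(out, out_split))
```

Because $\Phi$ never sees the energy and the sum is linear in $z_i$, these invariances hold for any choice of weights, which is the sense in which IRC safety is built into the architecture rather than learned.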
Abstract
A key question for machine learning approaches in particle physics is how to best represent and learn from collider events. As an event is intrinsically a variable-length unordered set of particles, we build upon recent machine learning efforts to learn directly from sets of features or "point clouds". Adapting and specializing the "Deep Sets" framework to particle physics, we introduce Energy Flow Networks, which respect infrared and collinear safety by construction. We also develop Particle Flow Networks, which allow for general energy dependence and the inclusion of additional particle-level information such as charge and flavor. These networks feature a per-particle internal (latent) representation, and summing over all particles yields an overall event-level latent representation. We show how this latent space decomposition unifies existing event representations based on detector images and radiation moments. To demonstrate the power and simplicity of this set-based approach, we apply these networks to the collider task of discriminating quark jets from gluon jets, finding similar or improved performance compared to existing methods. We also show how the learned event representation can be directly visualized, providing insight into the inner workings of the model. These architectures lend themselves to efficiently processing and analyzing events for a wide variety of tasks at the Large Hadron Collider. Implementations and examples of our architectures are available online in our EnergyFlow package.
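As one illustration of how the latent space decomposition can reproduce a detector image, the sketch below takes each component of $\Phi$ to be a pixel indicator function, so that the energy-weighted sum over particles is the pixel intensity. The 4x4 grid, the angular range, the helper name `phi_pixels`, and the toy particles are assumptions made for this example, not values from the paper.

```python
import numpy as np

N_PIX = 4
EDGES = np.linspace(-0.4, 0.4, N_PIX + 1)  # assumed angular range of the image

def phi_pixels(p_hat):
    """One-hot pixel indicator per particle, shape (n_particles, N_PIX * N_PIX)."""
    # np.digitize gives 1-based bin indices; shift to 0-based and clip so that
    # a particle exactly on the upper edge lands in the last pixel.
    iy = np.clip(np.digitize(p_hat[:, 0], EDGES) - 1, 0, N_PIX - 1)
    ix = np.clip(np.digitize(p_hat[:, 1], EDGES) - 1, 0, N_PIX - 1)
    one_hot = np.zeros((len(p_hat), N_PIX * N_PIX))
    one_hot[np.arange(len(p_hat)), iy * N_PIX + ix] = 1.0
    return one_hot

# Toy jet: energy fractions and (rapidity, azimuth) offsets inside the image.
z = np.array([0.4, 0.3, 0.2, 0.1])
p_hat = np.array([[-0.30, -0.30],
                  [ 0.10,  0.25],
                  [ 0.10,  0.30],
                  [ 0.35, -0.05]])

# Latent representation: sum_i z_i * Phi(p_hat_i), reshaped into an image.
image = (z[:, None] * phi_pixels(p_hat)).sum(axis=0).reshape(N_PIX, N_PIX)

# Cross-check against a conventional energy-weighted 2D histogram.
reference, _, _ = np.histogram2d(p_hat[:, 0], p_hat[:, 1],
                                 bins=[EDGES, EDGES], weights=z)
print(np.allclose(image, reference))
```

In this picture the latent dimension equals the number of pixels, and $F$ is whatever model is then applied to the image. An EFN replaces the fixed indicator functions with learned filters.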
