Explainable Equivariant Neural Networks for Particle Physics: PELICAN

Alexander Bogatskiy, Timothy Hoffman, David W. Miller, Jan T. Offermann, Xiaoyang Liu

TL;DR

This work presents a comprehensive study of the PELICAN algorithm architecture in the context of both tagging (classification) and reconstructing (regression) Lorentz-boosted top quarks, including the difficult task of specifically identifying and measuring the W-boson inside the dense environment of the Lorentz-boosted top-quark hadronic final state.

Abstract

PELICAN is a novel permutation equivariant and Lorentz invariant or covariant aggregator network designed to overcome common limitations found in architectures applied to particle physics problems. Compared to many approaches that use non-specialized architectures that neglect underlying physics principles and require very large numbers of parameters, PELICAN employs a fundamentally symmetry group-based architecture that demonstrates benefits in terms of reduced complexity, increased interpretability, and raw performance. We present a comprehensive study of the PELICAN algorithm architecture in the context of both tagging (classification) and reconstructing (regression) Lorentz-boosted top quarks, including the difficult task of specifically identifying and measuring the $W$-boson inside the dense environment of the Lorentz-boosted top-quark hadronic final state. We also extend the application of PELICAN to the tasks of identifying quark-initiated vs. gluon-initiated jets, and a multi-class identification across five separate target categories of jets. When tested on the standard task of Lorentz-boosted top-quark tagging, PELICAN outperforms existing competitors with much lower model complexity and high sample efficiency. On the less common and more complex task of 4-momentum regression, PELICAN also outperforms hand-crafted, non-machine learning algorithms. We discuss the implications of symmetry-restricted architectures for the wider field of machine learning for physics.

Paper Structure (65 sections, 1 theorem, 38 equations, 34 figures, 12 tables)


Key Result

Theorem B.1

If an IRC-safe Lorentz-invariant observable with a mixture of massless and massive inputs is real analytic in the pairwise dot products $d_{ij}$ near the origin, then it can depend on the massless inputs only through their sum. If all inputs are massless, this observable can be reduced to an analyti
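The pairwise dot products $d_{ij}$ named in the theorem can be illustrated with a short numerical check. The sketch below is not taken from the PELICAN code base: the function names, the metric signature $(+,-,-,-)$, the random test momenta, and the choice of a boost along the $x$-axis are all assumptions made for illustration. It shows that the matrix $d_{ij} = p_i \cdot p_j$ is unchanged by a Lorentz boost and that relabeling the particles simply permutes its rows and columns.

```python
import numpy as np

# Minkowski metric with signature (+, -, -, -); this sign convention is an assumption.
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def pairwise_dots(p):
    """Return d[i, j] = p_i . p_j for an (N, 4) array of 4-momenta (E, px, py, pz)."""
    return p @ METRIC @ p.T


def boost_x(beta):
    """Lorentz boost matrix along the x-axis with velocity beta (units of c)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    boost = np.eye(4)
    boost[0, 0] = boost[1, 1] = gamma
    boost[0, 1] = boost[1, 0] = -gamma * beta
    return boost


# Hypothetical event: three massless and two massive particles.
rng = np.random.default_rng(0)
three_momenta = rng.normal(size=(5, 3))
masses = np.array([0.0, 0.0, 0.0, 1.0, 2.0])
energies = np.sqrt(masses**2 + (three_momenta**2).sum(axis=1))
p = np.column_stack([energies, three_momenta])

d = pairwise_dots(p)

# Lorentz invariance: boosting every momentum leaves d unchanged.
d_boosted = pairwise_dots(p @ boost_x(0.6).T)

# Permutation equivariance: relabeling particles permutes rows and columns of d.
perm = rng.permutation(5)
d_permuted = pairwise_dots(p[perm])

print(np.allclose(d, d_boosted))
print(np.allclose(d_permuted, d[np.ix_(perm, perm)]))
```

The diagonal entries $d_{ii}$ equal the squared masses, so massless inputs have vanishing diagonal entries up to floating-point error.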

Figures (34)

  • Figure 1: The 15 binary arrays of rank $4$ that represent the basis elements of the permutation equivariant aggregators of PELICAN.
  • Figure 2: The PELICAN equivariant block updating square arrays.
  • Figure 3: Performance of various ML architectures represented by the background rejection as a function of the signal efficiency.
  • Figure 4: Comparison of top-tagger background rejection performance for fixed signal efficiency ($\epsilon_{S} = 0.3$) as a function of the number of parameters in each model considered, combining data from table \ref{tab1} and table \ref{tab_pelican_model_size}.
  • Figure 5: Stacked histograms with proportional bin heights showing the mass spectra of the two targets, the true $W$ momentum $p^W_{\mathrm{true}}$ and the contained true $W$ momentum $p^W_{\mathrm{cont}}$. The left figure shows a stacked histogram of the true $W$ mass spectrum, composed of the FC events (purple) and non-FC events (blue). The right figure shows a stacked histogram of the contained $W$ mass spectrum, similarly composed of FC events (purple) and non-FC events (blue). In each case, the bin contents are scaled linearly relative to the total number of events, i.e. the fraction of FC events in a given bin is given by the apparent height of the FC curve divided by the total height of the bin (heights are measured from the $x$-axis). The two mass spectra of FC events, in fact, match.
  • ...and 29 more figures
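
The count of 15 in the Figure 1 caption is consistent with a standard counting argument: a binary rank-4 array that is invariant under simultaneous permutation of all four indices is fixed by which of its indices coincide, and there are 15 ways to partition four index slots into groups of equal indices (the Bell number $B_4$). The sketch below enumerates these patterns by brute force. It assumes this correspondence with the arrays in Figure 1, and the helper names are hypothetical rather than taken from the PELICAN code.

```python
from itertools import product

import numpy as np


def equality_pattern(indices):
    """Relabel indices by order of first appearance, e.g. (3, 1, 3, 0) -> (0, 1, 0, 2)."""
    labels = {}
    return tuple(labels.setdefault(i, len(labels)) for i in indices)


def basis_patterns(n):
    """Distinct equality patterns of 4-tuples with entries drawn from range(n)."""
    return sorted({equality_pattern(t) for t in product(range(n), repeat=4)})


def basis_arrays(n):
    """One binary (n, n, n, n) array per pattern, marking the tuples that realize it."""
    patterns = basis_patterns(n)
    arrays = {pattern: np.zeros((n, n, n, n), dtype=int) for pattern in patterns}
    for t in product(range(n), repeat=4):
        arrays[equality_pattern(t)][t] = 1
    return arrays


arrays = basis_arrays(5)
print(len(arrays))  # 15

# Every index tuple belongs to exactly one pattern, so the arrays sum to all ones.
total = sum(arrays.values())
print(bool((total == 1).all()))  # True
```

With fewer than four particles some patterns cannot occur (for example, four distinct indices need at least four particles), so the count reaches 15 only for $n \geq 4$.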

Theorems & Definitions (1)

  • Theorem B.1