Explainable Equivariant Neural Networks for Particle Physics: PELICAN

Alexander Bogatskiy; Timothy Hoffman; David W. Miller; Jan T. Offermann; Xiaoyang Liu

TL;DR

This work presents a comprehensive study of the PELICAN algorithm architecture in the context of both tagging (classification) and reconstructing (regression) Lorentz-boosted top quarks, including the difficult task of specifically identifying and measuring the W-boson inside the dense environment of the Lorentz-boosted top-quark hadronic final state.

Abstract

PELICAN is a novel permutation-equivariant and Lorentz-invariant or -covariant aggregator network designed to overcome common limitations found in architectures applied to particle physics problems. Many existing approaches use non-specialized architectures that neglect underlying physics principles and require very large numbers of parameters. In contrast, PELICAN employs a fundamentally symmetry-group-based architecture that demonstrates benefits in terms of reduced complexity, increased interpretability, and raw performance. We present a comprehensive study of the PELICAN algorithm architecture in the context of both tagging (classification) and reconstructing (regression) Lorentz-boosted top quarks, including the difficult task of specifically identifying and measuring the $W$-boson inside the dense environment of the Lorentz-boosted top-quark hadronic final state. We also extend the application of PELICAN to the tasks of identifying quark-initiated vs. gluon-initiated jets, and to multi-class identification across five separate target categories of jets. When tested on the standard task of Lorentz-boosted top-quark tagging, PELICAN outperforms existing competitors with much lower model complexity and high sample efficiency. On the less common and more complex task of 4-momentum regression, PELICAN also outperforms hand-crafted, non-machine-learning algorithms. We discuss the implications of symmetry-restricted architectures for the wider field of machine learning for physics.

Paper Structure (65 sections, 1 theorem, 38 equations, 34 figures, 12 tables)

Introduction
Top-tagging with a PELICAN classifier
Quark-vs-gluon-initiated jet tagging with a PELICAN classifier
Multi-class jet tagging with a PELICAN classifier
$W$-boson 4-momentum reconstruction with PELICAN
$W$-boson mass reconstruction with PELICAN
Explaining PELICAN 4-momentum reconstruction
IRC-safety and PELICAN
Equivariance and jet physics
Lorentz symmetry and jets
Lorentz invariance
Permutation equivariance
Elementary equivariant aggregators
Equivariance and Jet Physics
PELICAN architecture
...and 50 more sections

Key Result

Theorem B.1

If an IRC-safe Lorentz-invariant observable with a mixture of massless and massive inputs is real analytic in the pairwise dot products $d_{ij}$ near the origin, then it can depend on the massless inputs only through their sum. If all inputs are massless, this observable can be reduced to an analyti
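
The pairwise dot products $d_{ij}$ that the theorem refers to can be illustrated numerically. The following is a minimal sketch, assuming the $(+,-,-,-)$ metric signature and $d_{ij} = p_i \cdot p_j$; the function names `pairwise_dots` and `boost_z` are hypothetical and not taken from the PELICAN code. It checks that the matrix of dot products is unchanged by a Lorentz boost, and that permuting the particles permutes its rows and columns in the same way.

```python
import numpy as np

# Minkowski metric, signature (+, -, -, -): an assumption of this sketch.
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])

def pairwise_dots(p):
    """Return d[i, j] = p_i . p_j for an (N, 4) array of 4-momenta (E, px, py, pz)."""
    return p @ METRIC @ p.T

def boost_z(beta):
    """Lorentz boost matrix along the z-axis with velocity beta (|beta| < 1)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    boost = np.eye(4)
    boost[0, 0] = boost[3, 3] = gamma
    boost[0, 3] = boost[3, 0] = -gamma * beta
    return boost

# Five random massless particles: E = |p|.
rng = np.random.default_rng(0)
mom3 = rng.normal(size=(5, 3))
energy = np.linalg.norm(mom3, axis=1, keepdims=True)
p = np.hstack([energy, mom3])

d = pairwise_dots(p)

# Boost every particle (rows are 4-vectors, so multiply by the transpose).
d_boosted = pairwise_dots(p @ boost_z(0.6).T)

# Relabel the particles with a random permutation.
perm = rng.permutation(len(p))
d_permuted = pairwise_dots(p[perm])

invariant = np.allclose(d, d_boosted)                          # Lorentz invariance
equivariant = np.allclose(d_permuted, d[np.ix_(perm, perm)])   # permutation equivariance
massless_diag = np.allclose(np.diag(d), 0.0, atol=1e-9)        # d_ii = m_i^2 = 0
```

Because the diagonal entries $d_{ii}$ equal the squared masses, they vanish for massless inputs, which is the regime the second sentence of the theorem addresses.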

Figures (34)

Figure 1: The 15 binary arrays of rank $4$ that represent the basis elements of the permutation equivariant aggregators of PELICAN.
Figure 2: The PELICAN equivariant block updating square arrays.
Figure 3: Performance of various ML architectures represented by the background rejection as a function of the signal efficiency.
Figure 4: Comparison of top-tagger background rejection performance for fixed signal efficiency ($\epsilon_{S} = 0.3$) as a function of the number of parameters in each model considered, combining data from Table \ref{tab1} and Table \ref{tab_pelican_model_size}.
Figure 5: Stacked histogram with proportional bin heights showing the mass spectrum of the two targets, the true $W$ momentum $p^W_{\mathrm{true}}$, and the contained true $W$ momentum $p^W_{\mathrm{cont}}$. The left figure shows a stacked histogram of the true $W$ mass spectrum comprised of the FC events (purple) and non-FC events (blue). The right figure shows a stacked histogram of mass spectrum for the contained $W$ mass, similarly comprised of both FC events (purple) and non-FC events (blue). In each case, the bin contents are scaled linearly relative to the total number of events, i.e. the fraction of FC events in a given bin is given by the apparent height of the FC curve divided by the total height of the bin (heights are measured from the $x$-axis). The two mass spectra of FC events, in fact, match.
...and 29 more figures
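
The count of 15 in the Figure 1 caption coincides with the Bell number $B_4 = 15$, the number of ways to partition four index positions into groups of equal indices. The sketch below assumes that each basis element is the indicator array of one such equality pattern; that correspondence is an assumption of this example rather than something stated in the caption, and `set_partitions` and `indicator_array` are hypothetical helper names.

```python
import numpy as np

def set_partitions(items):
    """Yield every partition of `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in set_partitions(rest):
        # Put `first` into each existing block in turn...
        for k in range(len(partial)):
            yield partial[:k] + [[first] + partial[k]] + partial[k + 1:]
        # ...or into a new block of its own.
        yield [[first]] + partial

def indicator_array(partition, n):
    """Binary rank-4 array: 1 where index equalities match the partition exactly."""
    block_of = {pos: b for b, block in enumerate(partition) for pos in block}
    arr = np.zeros((n,) * 4, dtype=np.uint8)
    for idx in np.ndindex(*arr.shape):
        same_pattern = all(
            (idx[a] == idx[b]) == (block_of[a] == block_of[b])
            for a in range(4) for b in range(a + 1, 4)
        )
        arr[idx] = same_pattern
    return arr

partitions = list(set_partitions([0, 1, 2, 3]))
n = 4  # array side length; n >= 4 is needed so that every pattern is non-empty
basis = [indicator_array(part, n) for part in partitions]
num_basis = len(basis)

# Every index tuple has exactly one equality pattern, so the arrays tile the cube.
total = sum(b.astype(int) for b in basis)
```

Here `num_basis` evaluates to 15, and the indicator arrays are mutually disjoint and sum to the all-ones array, which is what makes them a convenient basis.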

Theorems & Definitions (1)

Theorem B.1
