A case study of sending graph neural networks back to the test bench for applications in high-energy particle physics
Emanuel Pfeffer, Michael Waßmer, Yee-Ying Cung, Roger Wolf, Ulrich Husemann
TL;DR
This study benchmarks graph neural networks (GNNs) against deep fully-connected neural networks (DNNs) for classifying the underlying ttbb-related processes (ttbb, ttH(bb), ttZ(bb)) in LHC-like events, under a carefully controlled setup that fixes hyperparameters and information exposure. The authors show that GNNs do not inherently outperform DNNs without relational information; incorporating physics-informed edge weights and multiple GraphConv layers allows GNNs to recover or surpass DNN performance, particularly when the models have comparable numbers of trainable parameters. The results highlight that the gain from GNNs arises from access to hierarchical neighbor information encoded by graph convolutions, rather than from mere graph structure. The findings suggest that GNNs can offer advantages in tasks involving hierarchically structured, multi-object final states, with implications for jet physics and event classification at the LHC. Overall, the work provides a principled framework to assess GNNs in high-energy physics and clarifies when relational inductive biases are beneficial.
Abstract
In high-energy particle collisions, the primary collision products usually decay further, resulting in tree-like, hierarchical structures with a priori unknown multiplicity. At the stable-particle level, all decay products of a collision form permutation-invariant sets of final-state objects. The analogy to mathematical graphs gives rise to the idea that graph neural networks (GNNs), which naturally reflect these properties, should be best suited to address many tasks related to high-energy particle physics. In this paper we describe a benchmark test of a typical GNN against neural networks of the well-established deep fully-connected feed-forward architecture. We aim to make this comparison as unbiased as possible in terms of the number of nodes, hidden layers, or trainable parameters of the neural networks under study. As a physics case we use the classification of the final state X produced in association with top quark-antiquark pairs in proton-proton collisions at the Large Hadron Collider at CERN, where X stands for a bottom quark-antiquark pair produced either non-resonantly or through the decay of an intermediately produced Z or Higgs boson.
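To make the notions of permutation invariance and edge-weighted graph convolution concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a graph convolution of the common form x_i' = ReLU(W_self x_i + W_neigh Σ_j a_ij x_j), a fully connected event graph, and a purely hypothetical edge weight a_ij = 1 / (1 + ΔR_ij) built from the angular distance between final-state objects; the function names, feature dimensions, and random inputs are illustrative only.

```python
import numpy as np

def delta_r_weights(eta, phi):
    """Hypothetical edge weights 1 / (1 + dR) between all pairs of objects."""
    d_eta = eta[:, None] - eta[None, :]
    d_phi = np.abs(phi[:, None] - phi[None, :])
    d_phi = np.where(d_phi > np.pi, 2.0 * np.pi - d_phi, d_phi)  # wrap to [0, pi]
    weights = 1.0 / (1.0 + np.hypot(d_eta, d_phi))
    np.fill_diagonal(weights, 0.0)  # no self-loops; the self term is separate
    return weights

def graph_conv(x, adj, w_self, w_neigh):
    """One layer: ReLU(x W_self + (A x) W_neigh), with A the weighted adjacency."""
    return np.maximum(x @ w_self + adj @ x @ w_neigh, 0.0)

def event_embedding(x, adj, params):
    """Stack graph convolutions, then sum over nodes (permutation-invariant pooling)."""
    h = x
    for w_self, w_neigh in params:
        h = graph_conv(h, adj, w_self, w_neigh)
    return h.sum(axis=0)

rng = np.random.default_rng(0)
n_obj, n_feat, n_hidden = 6, 4, 8  # illustrative sizes, not taken from the paper

# Toy event: random node features and random object directions.
x = rng.normal(size=(n_obj, n_feat))
eta = rng.uniform(-2.4, 2.4, size=n_obj)
phi = rng.uniform(-np.pi, np.pi, size=n_obj)
adj = delta_r_weights(eta, phi)

# Two layers: the second one sees features already mixed with neighbor information.
params = [
    (rng.normal(size=(n_feat, n_hidden)), rng.normal(size=(n_feat, n_hidden))),
    (rng.normal(size=(n_hidden, n_hidden)), rng.normal(size=(n_hidden, n_hidden))),
]
emb = event_embedding(x, adj, params)

# Reordering the objects (and the adjacency consistently) leaves the embedding unchanged.
perm = rng.permutation(n_obj)
emb_perm = event_embedding(x[perm], adj[np.ix_(perm, perm)], params)
print(np.allclose(emb, emb_perm))
```

In this sketch a fully-connected network applied to the concatenated node features would generally change its output under such a reordering, whereas the sum-pooled graph embedding does not. Setting `adj` to zero removes all relational information and reduces each layer to an independent per-object transformation, which illustrates why graph structure without informative edges need not help.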
