Optimal Equivariant Architectures from the Symmetries of Matrix-Element Likelihoods

Daniel Maître, Vishal S. Ngairangbam, Michael Spannowsky

TL;DR

A novel approach is presented that combines MEM-inspired symmetry considerations with equivariant neural network design for particle physics analysis and opens new directions for physics-informed architecture design, promising more powerful tools for probing physics beyond the Standard Model.

Abstract

The Matrix-Element Method (MEM) has long been a cornerstone of data analysis in high-energy physics. It leverages theoretical knowledge of parton-level processes and symmetries to evaluate the likelihood of observed events. In parallel, the advent of geometric deep learning has enabled neural network architectures that incorporate known symmetries directly into their design, leading to more efficient learning. This paper presents a novel approach that combines MEM-inspired symmetry considerations with equivariant neural network design for particle physics analysis. Even though Lorentz invariance and permutation invariance over all reconstructed objects are the largest and most natural symmetries in the input domain, we find that they are sub-optimal in most practical search scenarios. We propose a longitudinal boost-equivariant message-passing neural network architecture that preserves relevant discrete symmetries. We present numerical studies demonstrating that MEM-inspired architectures achieve new state-of-the-art performance in distinguishing di-Higgs decays to four bottom quarks from the QCD background, with enhanced sample and parameter efficiencies. This synergy between MEM and equivariant deep learning opens new directions for physics-informed architecture design, promising more powerful tools for probing physics beyond the Standard Model.
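
To make the notion of a longitudinal boost concrete, the following minimal sketch checks numerically that a boost along the beam axis shifts every rapidity by the same constant, so that transverse momenta and rapidity differences are unchanged. It is not the architecture proposed in the paper; the helper names (`boost_z`, `rapidity`, `pt`) and the two jet four-momenta are hypothetical values chosen only for illustration.

```python
import math

def boost_z(p, beta):
    """Boost a four-momentum (E, px, py, pz) along the beam (z) axis with velocity beta."""
    E, px, py, pz = p
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    # Only E and pz mix; the transverse components are untouched.
    return (gamma * (E - beta * pz), px, py, gamma * (pz - beta * E))

def rapidity(p):
    """Rapidity y = 0.5 * ln((E + pz) / (E - pz))."""
    E, _, _, pz = p
    return 0.5 * math.log((E + pz) / (E - pz))

def pt(p):
    """Transverse momentum."""
    return math.hypot(p[1], p[2])

# Two hypothetical jets (E, px, py, pz) in GeV, illustration only.
jet_a = (100.0, 30.0, 40.0, 60.0)
jet_b = (80.0, -20.0, 10.0, -50.0)

beta = 0.6
boosted_a = boost_z(jet_a, beta)
boosted_b = boost_z(jet_b, beta)

# Each rapidity shifts by -atanh(beta); the difference between jets is invariant.
dy_before = rapidity(jet_a) - rapidity(jet_b)
dy_after = rapidity(boosted_a) - rapidity(boosted_b)
shift_a = rapidity(boosted_a) - rapidity(jet_a)

print(f"rapidity shift of jet_a: {shift_a:.6f}, -atanh(beta): {-math.atanh(beta):.6f}")
print(f"delta-y before: {dy_before:.6f}, after: {dy_after:.6f}")
```

Features built from transverse momenta and rapidity differences are therefore unaffected by longitudinal boosts, while the individual rapidities transform by a common shift.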

Paper Structure

This paper contains 17 sections, 18 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Representation of the hierarchy in group-invariant function approximation, where a larger group ($S_4$) imposes additional constraints on the weights compared to a proper subgroup ($S_2\times S_2$). Although weights satisfying the constraints of $S_2\times S_2$ can come to satisfy those of $S_4$, weights under the stronger constraints of $S_4$ cannot represent a function that is $S_2\times S_2$ invariant but not $S_4$ invariant, as the weights of such a function satisfy $w_1=w_2\neq w_3=w_4$ and therefore lie strictly outside the red ellipse. Therefore, even though $S_4$ contains the group $S_2\times S_2$, an $S_4$-invariant function cannot become a purely $S_2\times S_2$-invariant function. This holds for general group-invariant functions due to the structure of fibres induced by invariance in the function's domain (see fig. \ref{fig:inv_fibre}).
  • Figure 2: The left shows some possible partitions of a bounded domain $\mathcal{D}$ in 2D out of infinitely many possibilities. On the top right, the yellow rectangle is a saturated set in $\mathsf{P}_1$ while the blue ellipse is not. Consequently, if one restricts the smallest possible fibres that a function approximator can have to be those in $\mathsf{P}_1$, it can accommodate a target function with $\mathsf{P}_2$ (bottom left of figure on the right) as its fibres since all partitions in $\mathsf{P}_2$ are saturated under $\mathsf{P}_1$. However, if it had $\mathsf{P}_3$, no amount of function approximation on $\mathsf{P}_1$ will agree over the whole domain $\mathcal{D}$ since all of its fibres are unsaturated in $\mathsf{P}_1$. Note that for incompatibility, one fibre being unsaturated is sufficient for incorrectness of $\mathsf{P}_1$.
  • Figure 3: The difference between the smallest fibres (the sets of points in the domain where the function's value is equal) of a function invariant under a group $\mathcal{G}_1$ and of one invariant under its proper subgroup $\mathcal{G}_2$. The smaller squares denote the coarsening of the domain, with similar colours signifying equality of the function's value. $\mathcal{G}_1$-invariance assumes larger fibres from the start. In contrast, $\mathcal{G}_2$-invariance assumes smaller partitions that are compatible with $\mathcal{G}_1$-invariance, i.e. they can be enlarged so that the function becomes equal on the smallest fibres of $\mathcal{G}_1$-invariant functions.
  • Figure 4: A Lorentz boost in the direction opposite to the leading jet in a three-jet event (on the left) will transform it into a two-jet event (on the right). For baseline selection criteria that allow more than two jets in the final state, the MEM likelihood, evaluated as a sum over three-jet and two-jet final-state processes, will not be the same for the two events, so the event likelihood violates Lorentz invariance.
  • Figure 5: Median AUCs for the resonant signal for all architectures and training sizes. The shaded region shows the range, and the error bars denote the lower and upper quartiles.
  • ...and 1 more figure
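
The weight-constraint hierarchy described in the Figure 1 caption can be illustrated with a toy linear function $f(x)=\sum_i w_i x_i$ on four inputs. The sketch below is an illustration under stated assumptions, not the paper's model: the function `is_invariant`, the test point `x`, and the weight values are hypothetical, and invariance is checked only at one generic test point, which is a necessary check rather than a proof.

```python
from itertools import permutations

def f(w, x):
    """Toy linear function with weights w."""
    return sum(wi * xi for wi, xi in zip(w, x))

def is_invariant(w, group, x, tol=1e-12):
    """Check f(w, g.x) == f(w, x) for every permutation g in group, at the point x."""
    base = f(w, x)
    return all(abs(f(w, [x[i] for i in g]) - base) < tol for g in group)

# S_4: all permutations of four inputs.
S4 = list(permutations(range(4)))
# S_2 x S_2: swaps within the pair (0, 1) and within the pair (2, 3) only.
S2xS2 = [(0, 1, 2, 3), (1, 0, 2, 3), (0, 1, 3, 2), (1, 0, 3, 2)]

x = [1.0, 2.0, 3.0, 5.0]          # generic test point (hypothetical)
w_sub = [0.5, 0.5, -1.0, -1.0]    # w1 = w2 != w3 = w4
w_full = [0.5, 0.5, 0.5, 0.5]     # w1 = w2 = w3 = w4

print("w_sub  invariant under S2xS2:", is_invariant(w_sub, S2xS2, x))
print("w_sub  invariant under S4:   ", is_invariant(w_sub, S4, x))
print("w_full invariant under S2xS2:", is_invariant(w_full, S2xS2, x))
print("w_full invariant under S4:   ", is_invariant(w_full, S4, x))
```

Weights tied as $w_1=w_2$ and $w_3=w_4$ can still reach the $S_4$ solution by setting all four equal, whereas weights constrained to be fully equal can never produce the $w_1=w_2\neq w_3=w_4$ function.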