SEAL - A Symmetry EncourAging Loss for High Energy Physics

Pradyun Hebbar, Thandikire Madula, Vinicius Mikuni, Benjamin Nachman, Nadav Outmezguine, Inbar Savoray

TL;DR

SEAL introduces soft symmetry constraints to encourage Lorentz invariance in neural networks without altering architecture. It defines two penalties, the group-level $\Gamma_G$ based on random Lorentz boosts and the infinitesimal $\Gamma_{\delta}$ based on infinitesimal generators, and integrates them into the training loss with a weight $\lambda$. In toy and jet-tagging experiments, SEAL improves invariance and extrapolation to unseen kinematic regions, while maintaining or improving performance, illustrating a flexible, data-efficient alternative to strictly equivariant networks. The approach is broadly applicable to other transformation groups and can complement symmetry-aware modeling in high-energy physics and beyond.
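As a rough illustration of the group-level penalty, the sketch below estimates how much a scalar function of a four-momentum changes under randomly sampled Lorentz boosts and adds that estimate to a task loss with weight $\lambda$. The function names (`boost`, `random_beta`, `group_penalty`, `total_loss`), the squared-difference form of the penalty, the sampling scheme for the boost velocity, and the additive combination are all assumptions made for illustration; the paper's exact definition of $\Gamma_G$ may differ.

```python
import math
import random

def boost(p, beta):
    """Boost the four-momentum p = (E, px, py, pz) by velocity vector beta, |beta| < 1."""
    E, px, py, pz = p
    bx, by, bz = beta
    b2 = bx * bx + by * by + bz * bz
    if b2 == 0.0:
        return p
    gamma = 1.0 / math.sqrt(1.0 - b2)
    bp = bx * px + by * py + bz * pz
    coef = (gamma - 1.0) * bp / b2 - gamma * E
    return (gamma * (E - bp), px + coef * bx, py + coef * by, pz + coef * bz)

def random_beta(rng, beta_max):
    """Sample an isotropic direction and a magnitude uniform in [0, beta_max] (an assumed scheme)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            break
    mag = rng.uniform(0.0, beta_max)
    return tuple(mag * c / norm for c in v)

def group_penalty(f, events, n_boosts=8, beta_max=0.5, seed=0):
    """Mean squared change of f under random boosts: an assumed stand-in for Gamma_G."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for p in events:
        base = f(p)
        for _ in range(n_boosts):
            total += (f(boost(p, random_beta(rng, beta_max))) - base) ** 2
            count += 1
    return total / count

def total_loss(task_loss, penalty, lam):
    """Task loss plus the symmetry penalty weighted by lam (the weight lambda in the text)."""
    return task_loss + lam * penalty
```

With this sketch, a Lorentz-invariant function such as the squared invariant mass $E^2 - |\vec{p}|^2$ gives a penalty at the level of floating-point noise, while a frame-dependent quantity such as the energy gives a clearly non-zero one.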

Abstract

Physical symmetries provide a strong inductive bias for constructing functions to analyze data. In particular, this bias may improve robustness, data efficiency, and interpretability of machine learning models. However, building machine learning models that explicitly respect symmetries can be difficult due to the dedicated components required. Moreover, real-world experiments may not exactly respect fundamental symmetries at the level of finite granularities and energy thresholds. In this work, we explore an alternative approach to create symmetry-aware machine learning models. We introduce soft constraints that allow the model to decide the importance of added symmetries during the learning process instead of enforcing exact symmetries. We investigate two complementary approaches, one that penalizes the model based on specific transformations of the inputs and one inspired by group theory and infinitesimal transformations of the inputs. Using top quark jet tagging and Lorentz equivariance as examples, we observe that the addition of the soft constraints leads to more robust performance while requiring negligible changes to current state-of-the-art models.
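The second approach, based on infinitesimal transformations, can be sketched in a similar spirit: instead of sampling finite boosts, one checks that the function does not change to first order along each generator of the Lorentz group. The code below is a minimal sketch under stated assumptions: the penalty is taken to be the sum over the six generators of the squared directional derivative, the gradient is estimated by central finite differences rather than automatic differentiation, and the names `lorentz_generators`, `grad`, and `infinitesimal_penalty` are hypothetical. The paper's $\Gamma_\delta$ may be defined and normalized differently.

```python
def lorentz_generators():
    """Six 4x4 matrices acting on (E, px, py, pz): three boosts, then three rotations."""
    gens = []
    for i in (1, 2, 3):
        K = [[0.0] * 4 for _ in range(4)]
        K[0][i] = 1.0  # infinitesimal boost mixes E with p_i
        K[i][0] = 1.0
        gens.append(K)
    for i, j in ((1, 2), (2, 3), (3, 1)):
        J = [[0.0] * 4 for _ in range(4)]
        J[i][j] = -1.0  # infinitesimal rotation in the (i, j) plane
        J[j][i] = 1.0
        gens.append(J)
    return gens

def grad(f, p, h=1e-4):
    """Central finite-difference gradient of f at p (a stand-in for autodiff)."""
    g = []
    for k in range(4):
        up, dn = list(p), list(p)
        up[k] += h
        dn[k] -= h
        g.append((f(tuple(up)) - f(tuple(dn))) / (2.0 * h))
    return g

def infinitesimal_penalty(f, events):
    """Mean over events of sum_a (grad f . L_a p)^2: an assumed stand-in for Gamma_delta."""
    gens = lorentz_generators()
    total = 0.0
    for p in events:
        g = grad(f, p)
        for L in gens:
            delta = [sum(L[r][c] * p[c] for c in range(4)) for r in range(4)]
            total += sum(g[k] * delta[k] for k in range(4)) ** 2
    return total / len(events)
```

For the energy $f(p) = E$, only the boost generators contribute and each event adds $|\vec{p}|^2$ to the sum, whereas for the squared invariant mass the penalty vanishes up to numerical error.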

Paper Structure

This paper contains 12 sections, 10 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: The MSE score as a function of the boost applied to the training data. In Fig. \ref{fig:symm_exact} the symmetry is exact, while in Fig. \ref{fig:symm_broken} the symmetry is broken by a small spurion $s = \left(0,0,0,10^{-3}\right)$. The differential symmetry penalty $\delta$SEAL is shown in solid lines, and the group-sample penalty loss GSEAL is shown in dashed lines.
  • Figure 2: Tagger invariance to 3D Lorentz boosts as a function of the boost parameter evaluated on the ATLAS Top Tagging dataset.
  • Figure 3: Balanced accuracy as a function of the original jet transverse momentum. Circular markers represent models evaluated on the original test dataset. Diamonds show the balanced accuracy evaluated on the boosted test dataset. Also shown is the performance of PELICAN.
  • Figure 4: Tagger accuracy as a function of jet $p_T$ for the baseline model and soft penalty model. The vertical line represents the $p_T$ cut applied during training: values to the left were seen during training, and values to the right were unseen. The SEAL weights are chosen to be $\lambda = 1.0$ both for models trained with $\Gamma_G$ and for models trained with $\Gamma_\delta$.
  • Figure 5: Tagger background rejection at a signal acceptance of $0.3$ as a function of jet $p_T$ for the baseline model and soft penalty model. The vertical line represents the $p_T$ cut applied during training: values to the left were seen during training, and values to the right were unseen.
  • ...and 1 more figure