Loss function to optimise signal significance in particle physics

Jai Bardhan, Cyrin Neeraj, Subhadip Mitra, Tanumoy Mandal

TL;DR

This work introduces a surrogate loss to directly maximise collider signal significance, defined as $Z \approx N_s/\sqrt{N_b}$. It formulates a submodular set function $\Delta_Z$ and applies the Lovász extension to obtain a convex surrogate $\bar{\Delta}_Z$ suitable for gradient-based training. The authors prove that the $Z$-score-based loss is submodular and demonstrate, in a toy two-background classification task with a linear model, that models trained with $\bar{\Delta}_Z$ achieve higher signal efficiency at comparable $Z$ values than those trained with binary cross-entropy. The results indicate potential gains in collider search sensitivity, and the approach naturally incorporates process cross sections into training. Code for the loss is publicly available, enabling broader experimentation and extension to more realistic settings with deep models and multiple backgrounds.
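To make the significance definition concrete, the following minimal sketch evaluates $Z \approx N_s/\sqrt{N_b}$ for one signal and two backgrounds. It assumes the standard relation that an expected yield is cross section times integrated luminosity times selection efficiency. The function names, cross sections, luminosity, and efficiencies are all hypothetical illustration values, not numbers from the paper.

```python
import math


def expected_yield(cross_section_fb: float, luminosity_ifb: float, efficiency: float) -> float:
    """Expected event count N = sigma * L * efficiency (assumed standard relation)."""
    return cross_section_fb * luminosity_ifb * efficiency


def significance(n_signal: float, n_background: float) -> float:
    """Approximate significance Z ~ N_s / sqrt(N_b), as defined in the TL;DR."""
    if n_background <= 0:
        raise ValueError("N_b must be positive for Z = N_s / sqrt(N_b)")
    return n_signal / math.sqrt(n_background)


# Hypothetical inputs: one signal and two backgrounds with very different cross sections.
luminosity_ifb = 100.0
n_s = expected_yield(cross_section_fb=1.0, luminosity_ifb=luminosity_ifb, efficiency=0.5)
n_b1 = expected_yield(cross_section_fb=10.0, luminosity_ifb=luminosity_ifb, efficiency=0.02)
n_b2 = expected_yield(cross_section_fb=100.0, luminosity_ifb=luminosity_ifb, efficiency=0.0005)

z = significance(n_s, n_b1 + n_b2)
print(f"N_s = {n_s:.1f}, N_b = {n_b1 + n_b2:.1f}, Z = {z:.2f}")
```

Because each background enters $N_b$ weighted by its cross section, the same selection efficiency costs far more significance on the large-cross-section background than on the small one. This is the effect a cross-section-aware loss can exploit.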

Abstract

We construct a surrogate loss to directly optimise the significance metric used in particle physics. We evaluate our loss function for a simple event classification task using a linear model and show that it produces decision boundaries that change according to the cross sections of the processes involved. We find that the models trained with the new loss have higher signal efficiency for similar values of estimated signal significance compared to ones trained with a cross-entropy loss, showing promise to improve sensitivity of particle physics searches at colliders.

Paper Structure

This paper contains 15 sections, 18 equations, 4 figures, 1 algorithm.

Figures (4)

  • Figure 1: Decision boundaries of the linear classifier when trained with the $\bar{\Delta}_Z$ loss with hinge error for (a) Case 1 and (b) Case 2 (see Section \ref{sec:3}). When trained with $\bar{\Delta}_Z$, the classifier prioritises reducing the background with the larger cross section.
  • Figure 3: Loss landscapes for the four error measures $\mathbf{m}$ in Sec. \ref{sec:LZerr}. The $Z$ score loss is plotted with ground truth $GT = [1, 0]$ and $\sigma = [1, 10]$. The $x$ and $y$ axes denote the classifier outputs $(F_1(x), F_2(x))$. For the hinge error, the GT labels are converted to their signed equivalent.
  • Figure 4: (a) ROC curve of dataset (total) background efficiency vs. signal efficiency. (b) ROC curve of true background efficiency vs. signal efficiency. The true background efficiency differs from the total background efficiency in that it accounts for the cross sections of the background processes. We observe that our loss removes more background at higher signal efficiency.
  • Figure: Gradient of the Lovász $Z$ loss $\bar{\Delta}_Z$
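
The last entry refers to the gradient of the Lovász extension. As a rough illustration of how such a gradient is commonly computed for a generic set function, the sketch below sorts the per-event errors in decreasing order and assigns each event the marginal gain of adding it to the set. This is the standard construction, not the authors' algorithm. `lovasz_extension` and `toy_set_fn` are hypothetical names, and `toy_set_fn` is a simple submodular stand-in (square root of the set size), not the paper's $\Delta_Z$.

```python
import math
from typing import Callable, FrozenSet, List, Sequence, Tuple


def lovasz_extension(
    set_fn: Callable[[FrozenSet[int]], float],
    errors: Sequence[float],
) -> Tuple[float, List[float]]:
    """Return the Lovász extension of set_fn at `errors` and one subgradient.

    Assumes set_fn(empty set) == 0 and non-negative errors.
    """
    # Visit events from largest to smallest error.
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    grad = [0.0] * len(errors)
    chosen = set()
    previous = set_fn(frozenset())
    for idx in order:
        chosen.add(idx)
        current = set_fn(frozenset(chosen))
        grad[idx] = current - previous  # marginal gain of adding this event
        previous = current
    value = sum(g * e for g, e in zip(grad, errors))
    return value, grad


def toy_set_fn(subset: FrozenSet[int]) -> float:
    """Hypothetical submodular stand-in: a concave function of the set size."""
    return math.sqrt(len(subset))


# On an indicator vector the extension reproduces the set function itself.
value, grad = lovasz_extension(toy_set_fn, [1.0, 0.0, 1.0])
print(value, grad)
```

The surrogate is linear in the errors for a fixed ordering, so the marginal gains serve directly as the gradient with respect to the error vector. When the set function is submodular, the resulting extension is convex, which is what makes it usable with gradient-based training.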