Learnable cut flow for high energy physics

Jing Li, Hao Sun

TL;DR

The paper addresses the interpretability gap of neural networks in high-energy physics by introducing Learnable Cut Flow (LCF), a differentiable, data-driven approach that emulates traditional cut flows. LCF incorporates learnable cuts and Learnable Importance, plus two training strategies (parallel and sequential) to balance interpretability with performance, using mask-based losses to preserve data shape. Across six synthetic mocks and a real diboson versus QCD dataset, LCF learns meaningful cut boundaries, identifies discriminative features, handles redundancy/correlation, and demonstrates robustness while remaining lightweight to train. The work offers actionable, model-driven insights into feature importance and cut choices, with potential extensions toward optimizing significance and automating feature ordering for fully interpretable, automated analyses.

Abstract

Neural networks have emerged as a powerful paradigm for tasks in high energy physics, yet their opaque training process renders them black boxes. In contrast, the traditional cut flow method offers simplicity and interpretability but requires extensive manual tuning to identify optimal cut boundaries. To merge the strengths of both approaches, we propose the Learnable Cut Flow (LCF), a neural network that transforms the traditional cut selection into a fully differentiable, data-driven process. LCF implements two cut strategies, parallel (where observable distributions are treated independently) and sequential (where prior cuts shape subsequent ones), to flexibly determine optimal boundaries. Building on this strategy, we introduce the Learnable Importance, a metric that quantifies feature importance and adjusts each feature's contribution to the loss accordingly, offering model-driven insights unlike ad-hoc metrics. To ensure differentiability, a modified loss function replaces hard cuts with mask operations, preserving data shape throughout the training process. LCF is tested on six varied mock datasets and a realistic diboson vs. QCD dataset. Results demonstrate that LCF (1) accurately learns cut boundaries across typical feature distributions in both parallel and sequential strategies, (2) assigns higher importance to discriminative features with minimal overlap, (3) handles redundant or correlated features robustly, and (4) performs effectively in real-world scenarios. In the diboson dataset, LCF initially underperforms boosted decision trees and multilayer perceptrons when using all observables. LCF bridges the gap between the traditional cut flow method and modern black-box neural networks, delivering actionable insights into the training process and feature importance. Source code and experimental data are available at https://github.com/Star9daisy/learnable-cut-flow.
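The abstract's remark that masks preserve data shape, where hard cuts do not, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (the linked repository is the reference for that): the sigmoid relaxation, the `sharpness` parameter, the function names, the toy Gaussian events, and the use of a cumulative product to mimic the sequential strategy are all assumptions made here for illustration.

```python
import numpy as np


def hard_cut(x, lower, upper):
    """Traditional cut: events outside (lower, upper) are dropped,
    so the output is shorter than the input."""
    return x[(x > lower) & (x < upper)]


def soft_mask(x, lower, upper, sharpness=20.0):
    """Smooth stand-in for a pass/fail mask (an assumed sigmoid relaxation).
    The output has the same shape as x, with values in (0, 1)."""
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))
    return sigmoid(sharpness * (x - lower)) * sigmoid(sharpness * (upper - x))


# Toy events with two features (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
events = rng.normal(size=(1000, 2))
lower = np.array([-1.0, -0.5])
upper = np.array([1.0, 0.5])

# Parallel-style: each feature's mask is evaluated independently.
parallel_masks = soft_mask(events, lower, upper)      # shape (1000, 2)

# Sequential-style: the mask after cut k is the running product of the
# masks of cuts 1..k, so earlier cuts down-weight events for later ones.
sequential_masks = np.cumprod(parallel_masks, axis=1)  # shape (1000, 2)

# The hard cut changes the number of events; the masks do not.
n_after_hard_cut = hard_cut(events[:, 0], lower[0], upper[0]).shape[0]
```

Because the mask is a smooth function of `lower` and `upper`, a gradient-based optimizer could in principle adjust the boundaries, which a boolean index cannot offer.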

Paper Structure

This paper contains 17 sections, 36 equations, 26 figures, 8 tables.

Figures (26)

  • Figure 1: The structure of the learnable cut flow model.
  • Figure 2: Mock feature distributions from $x_1$ to $x_{10}$. $x_1$, $x_2$, $x_3$, and $x_4$ represent four basic cases of signal relative location. $x_5$ and $x_6$ have weaker and stronger separation, respectively, compared with $x_1$. $x_7$ and $x_8$ are two redundant features. $x_9$ and $x_{10}$ are two features highly correlated with $x_1$. The center value divides a distribution into two parts to train two sets of trainable weights to form one learned cut.
  • Figure 3: Feature correlations of mock datasets. The Mock4 dataset contains two features, $x_9$ and $x_{10}$, that are highly correlated with $x_1$.
  • Figure 4: Distributions of jet substructure variables in the real diboson vs. QCD dataset. The center value divides a distribution into two parts to train two sets of trainable weights to form one learned cut.
  • Figure 5: Observable correlations of the diboson dataset.
  • ...and 21 more figures