Learnable cut flow for high energy physics

Jing Li; Hao Sun

TL;DR

The paper addresses the interpretability gap of neural networks in high-energy physics by introducing Learnable Cut Flow (LCF), a differentiable, data-driven approach that emulates traditional cut flows. LCF incorporates learnable cuts and Learnable Importance, plus two training strategies (parallel and sequential) to balance interpretability with performance, using mask-based losses to preserve data shape. Across six synthetic mocks and a real diboson versus QCD dataset, LCF learns meaningful cut boundaries, identifies discriminative features, handles redundancy/correlation, and demonstrates robustness while remaining lightweight to train. The work offers actionable, model-driven insights into feature importance and cut choices, with potential extensions toward optimizing significance and automating feature ordering for fully interpretable, automated analyses.

Abstract

Neural networks have emerged as a powerful paradigm for tasks in high energy physics, yet their opaque training process renders them black boxes. In contrast, the traditional cut flow method offers simplicity and interpretability but requires extensive manual tuning to identify optimal cut boundaries. To merge the strengths of both approaches, we propose the Learnable Cut Flow (LCF), a neural network that transforms traditional cut selection into a fully differentiable, data-driven process. LCF implements two cut strategies to flexibly determine optimal boundaries: parallel, where observable distributions are treated independently, and sequential, where prior cuts shape subsequent ones. Building on these strategies, we introduce the Learnable Importance, a metric that quantifies feature importance and adjusts each feature's contribution to the loss accordingly, offering model-driven insights unlike ad hoc metrics. To ensure differentiability, a modified loss function replaces hard cuts with mask operations, preserving data shape throughout the training process. LCF is tested on six varied mock datasets and a realistic diboson vs. QCD dataset. Results demonstrate that LCF (1) accurately learns cut boundaries across typical feature distributions in both parallel and sequential strategies, (2) assigns higher importance to discriminative features with minimal overlap, (3) handles redundant or correlated features robustly, and (4) performs effectively in real-world scenarios. In the diboson dataset, LCF initially underperforms boosted decision trees and multilayer perceptrons when using all observables. LCF bridges the gap between the traditional cut flow method and modern black-box neural networks, delivering actionable insights into the training process and feature importance. Source code and experimental data are available at https://github.com/Star9daisy/learnable-cut-flow.
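To make the idea of replacing hard cuts with mask operations more concrete, the sketch below contrasts a traditional cut, which drops events and changes the data shape, with a per-event mask that keeps one weight per event. It is an illustration only. The sigmoid-product form of the mask, the `sharpness` parameter, the function names, and the reading of the sequential strategy as a running product of masks are all assumptions made here for clarity, and may differ from the formulation used in LCF.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hard_cut(values, lower, upper):
    """Traditional cut: events outside [lower, upper] are dropped,
    so the output can be shorter than the input."""
    return [v for v in values if lower <= v <= upper]


def soft_mask(values, lower, upper, sharpness=10.0):
    """Assumed smooth stand-in for a cut: one weight in (0, 1) per event.
    The output always has the same length as the input, and the weights
    vary smoothly with the boundaries `lower` and `upper`."""
    return [
        sigmoid(sharpness * (v - lower)) * sigmoid(sharpness * (upper - v))
        for v in values
    ]


def parallel_masks(features, bounds):
    """Parallel reading (assumption): each observable is masked independently."""
    return [soft_mask(col, lo, hi) for col, (lo, hi) in zip(features, bounds)]


def sequential_masks(features, bounds):
    """Sequential reading (assumption): each mask is multiplied by the
    masks of all earlier cuts, so prior cuts shape later ones."""
    running = [1.0] * len(features[0])
    masks = []
    for col, (lo, hi) in zip(features, bounds):
        current = soft_mask(col, lo, hi)
        running = [r * w for r, w in zip(running, current)]
        masks.append(list(running))
    return masks


# Two toy observables for four events, with hypothetical cut boundaries.
features = [[0.0, 1.0, 2.0, 3.0], [3.0, 2.0, 1.0, 0.0]]
bounds = [(0.5, 2.5), (1.5, 3.5)]

kept = hard_cut(features[0], *bounds[0])   # shorter list
mask = soft_mask(features[0], *bounds[0])  # same length as the input
par = parallel_masks(features, bounds)
seq = sequential_masks(features, bounds)
```

In this toy setup the hard cut returns two of the four events, while the mask keeps all four entries and assigns weights near 1 inside the window and near 0 outside it. Because the mask is a smooth function of the boundaries, gradients with respect to `lower` and `upper` exist, which is the property a differentiable cut needs.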
