Machine-Learning-Inspired SMEFT Simplified Template Cross Sections: A Case Study in ZH Production

Daniel Conde; Miguel G. Folgado; Veronica Sanz

Abstract

The Simplified Template Cross Section (STXS) program has become the standard interface between Higgs measurements and global fits, but its fixed one-dimensional boundaries are not guaranteed to align with the phase-space directions to which the Standard Model Effective Field Theory (SMEFT) is most sensitive. We propose a machine-learning-inspired extension of STXS in which supervised classifiers are used only at the design stage to identify simple, publishable phase-space boundaries. Using associated Higgs production, $pp \to ZH$, as a case study and a benchmark momentum-dependent bosonic SMEFT deformation, we show that the relevant signal-background separation is well captured by a linear boundary in the $(p_T^Z, m_{ZH})$ plane. We construct such boundaries with a linear support vector machine and with a deep-neural-network-assisted distillation procedure, and compare them directly with the standard STXS $p_T^Z$ bins through a common single-region Asimov-significance analysis. In this proof-of-concept setup, the ML-inspired regions systematically outperform the corresponding STXS regions, with the largest gains appearing in the boosted regime where SMEFT effects are concentrated. The final observable remains a simple linear cut, preserving the transparency and experimental portability that make STXS useful.
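To make the single-region comparison concrete, the following is a minimal sketch of how a linear cut in the $(p_T^Z, m_{ZH})$ plane could be scored with an Asimov significance. Several pieces are assumptions rather than details taken from the paper: the significance is the widely used profile-likelihood Asimov formula with a Gaussian-constrained background, $\epsilon$ is taken to be a relative background uncertainty ($\sigma_b = \epsilon\, b$), the selected region is taken to lie above the line $m_{ZH} = \text{slope} \cdot p_T^Z + \text{intercept}$, and the function names and event layout are purely illustrative.

```python
import math


def asimov_significance(s, b, eps=0.0):
    """Median expected significance for s signal and b background events.

    Assumption: eps is a relative background uncertainty, sigma_b = eps * b.
    For eps = 0 this reduces to sqrt(2 * ((s + b) * ln(1 + s / b) - s)).
    """
    if s <= 0.0 or b <= 0.0:
        return 0.0
    if eps == 0.0:
        z2 = 2.0 * ((s + b) * math.log(1.0 + s / b) - s)
    else:
        var = (eps * b) ** 2
        term1 = (s + b) * math.log((s + b) * (b + var) / (b * b + (s + b) * var))
        term2 = (b * b / var) * math.log(1.0 + var * s / (b * (b + var)))
        z2 = 2.0 * (term1 - term2)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(z2, 0.0))


def region_yields(events, slope, intercept):
    """Sum signal and background weights above the line in the (pT^Z, m_ZH) plane.

    Assumption: each event is a tuple (pt_z, m_zh, w_signal, w_background),
    and the selected region is m_zh > slope * pt_z + intercept.
    """
    s = 0.0
    b = 0.0
    for pt_z, m_zh, w_sig, w_bkg in events:
        if m_zh > slope * pt_z + intercept:
            s += w_sig
            b += w_bkg
    return s, b


def region_significance(events, slope, intercept, eps=0.0):
    """Asimov significance of the single region defined by a linear cut."""
    s, b = region_yields(events, slope, intercept)
    return asimov_significance(s, b, eps)
```

Under these assumptions, a slope-intercept scan of the kind shown in the significance-landscape figures amounts to evaluating `region_significance` on a grid of `(slope, intercept)` values and keeping the maximum. A one-dimensional STXS-like $p_T^Z$ bin can be scored with the same `asimov_significance` function once its yields are known, which keeps the comparison on a common footing.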

Paper Structure (21 sections, 18 equations, 9 figures, 7 tables)

Introduction
Physics target and STXS baseline
SMEFT origin of the boosted deformation
Why STXS is the right baseline
Simulation, observables, and statistical setup
Event generation and benchmark definition
Event selection and observable set
Training samples and feature preprocessing
Single-region statistical metric
Machine-learning-inspired region construction
Linear SVM in the $(p_T^Z, m_{ZH})$ plane
Significance-driven refinement of the linear boundary
DNN-guided distillation into a linear region
Results
Comparison of the learned boundaries
...and 6 more sections

Figures (9)

Figure 1: Asimov significance in the slope--intercept plane for the scan around the SVM solution in the highest-$p_T^Z$ slice, for a representative choice $\epsilon=0.5$.
Figure 2: Classifier performance inside the four official STXS $p_T^Z$ slices. Left: ROC curves for SM--benchmark discrimination with a shallow classifier trained separately in each slice. Right: background efficiency versus signal efficiency. The increasing separability toward large $p_T^Z$ motivates the focus on the boosted region.
Figure 3: Asimov significance as a function of the DNN-score threshold used to define the projected high-score region in the highest-$p_T^Z$ slice.
Figure 4: Significance landscape for the post-fit slope--intercept scan applied to the DNN-distilled boundary in the highest-$p_T^Z$ slice, for $\epsilon=0.5$.
Figure 5: Comparison of the official STXS boundary with the ML-inspired linear boundaries in the highest-$p_T^Z$ slice, shown for the representative background-uncertainty choices used in the analysis. The ML-guided lines select the correlated high-$p_T^Z$, high-$m_{ZH}$ region more efficiently than the one-dimensional STXS slicing.
...and 4 more figures
