Higgs Production Classifier using Weak Supervision

Kai-Feng Chen, Yi-An Chen, Cheng-Wei Chiang, Feng-Yang Hsieh

TL;DR

The paper addresses how to distinguish Higgs production mechanisms, specifically vector boson fusion (VBF) and gluon-gluon fusion (GGF), in hadron collider data without event-level truth labels. It implements the Classification Without Labels (CWoLa) weakly supervised framework using real diphoton data, comparing convolutional neural networks (CNN) and a Particle Transformer (ParT) on image and set representations, respectively. A physics-motivated azimuthal augmentation ($\phi$-shifting) is introduced to boost training statistics, and the authors test decay-channel transferability across $H \to \gamma\gamma$, $H \to ZZ\to 4\ell$, and $H \to Z\gamma\to 2\ell\gamma$, observing strong cross-channel generalization, especially when the decay products are not affected by strong QCD. The results show that hadronic activity provides most of the discriminative power, that augmentation improves performance in low-data regimes, and that decay-agnostic classifiers can be reused effectively across channels, enabling data-efficient Higgs analyses at the LHC.
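
The CWoLa idea mentioned above can be illustrated with a toy sketch. The code below is not the authors' pipeline: the one-dimensional Gaussian feature, the VBF fractions of 0.7 and 0.3, the sample sizes, and the helper names (`make_mixed_sample`, `cwola_dataset`, `fit_logistic`, `auc`) are all assumptions made for illustration, and a logistic regression stands in for the CNN and ParT models. It only shows that a classifier trained on region labels (signal region versus control region) can still separate the underlying classes.

```python
import numpy as np

def make_mixed_sample(rng, n, vbf_fraction):
    """Toy mixed sample with one feature: VBF ~ N(+1, 1), GGF ~ N(-1, 1)."""
    is_vbf = rng.random(n) < vbf_fraction
    x = rng.normal(np.where(is_vbf, 1.0, -1.0), 1.0)
    return x, is_vbf

def cwola_dataset(x_sr, x_cr):
    """Label events by region only: SR -> 1, CR -> 0. Truth labels are never used."""
    x = np.concatenate([x_sr, x_cr])
    y = np.concatenate([np.ones(len(x_sr)), np.zeros(len(x_cr))])
    return x, y

def fit_logistic(x, y, lr=0.1, steps=500):
    """One-feature logistic regression trained by plain gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))
        w -= lr * np.mean((p - y) * x)
        b -= lr * np.mean(p - y)
    return w, b

def auc(scores, truth):
    """Rank-based AUC (Mann-Whitney U), assuming continuous scores without ties."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = truth.sum()
    n_neg = len(truth) - n_pos
    return (ranks[truth].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(0)
# Assumed purities: the signal region is VBF-enriched, the control region GGF-enriched.
x_sr, truth_sr = make_mixed_sample(rng, 5000, vbf_fraction=0.7)
x_cr, truth_cr = make_mixed_sample(rng, 5000, vbf_fraction=0.3)

x, y = cwola_dataset(x_sr, x_cr)      # training uses region labels only
w, b = fit_logistic(x, y)

# Truth labels are used here solely to evaluate the weakly supervised classifier.
truth = np.concatenate([truth_sr, truth_cr])
truth_auc = auc(w * x + b, truth)
print(f"AUC against truth labels: {truth_auc:.3f}")
```

In this toy setting the classifier never sees which events are VBF or GGF, yet its score ranks them well, because the optimal discriminant between the two mixed samples is a monotonic function of the optimal VBF-versus-GGF discriminant whenever the two regions have different VBF fractions.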

Abstract

A reliable determination of the Higgs production mechanism in hadron collider experiments is essential to the program of Higgs coupling measurements. We employ weak supervision, CWoLa in particular, to train deep neural networks using real data of diphoton events, in the hope of reducing biases resulting from Monte Carlo simulations. Models based on the convolutional neural network and the transformer are tested and compared. In particular, the classification performance gets slightly better when the photon information is removed from training in the low-luminosity region of $H \to \gamma\gamma$. We explicitly show that the performance can be improved when the training dataset is enlarged by data augmentation using physics-motivated methods. We further demonstrate that the trained model can be successfully applied to the $H \to ZZ$ and $H \to Z\gamma$ events, showing that such classifiers are agnostic to Higgs decay modes provided they do not involve strong QCD corrections.
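
As a rough illustration of the physics-motivated augmentation, the sketch below implements one plausible reading of $\phi$-shifting: every constituent of an event is rotated by a common random azimuthal offset and wrapped back into $[-\pi, \pi)$. The text here does not spell out the authors' exact procedure, so the single shared offset, the uniform sampling, and the helper names (`wrap_phi`, `phi_shift`, `augment`) are assumptions.

```python
import numpy as np

def wrap_phi(phi):
    """Map azimuthal angles back into [-pi, pi)."""
    return (phi + np.pi) % (2.0 * np.pi) - np.pi

def phi_shift(phi, rng):
    """Rotate all constituents of one event by the same random offset (assumed scheme)."""
    delta = rng.uniform(-np.pi, np.pi)
    return wrap_phi(phi + delta)

def augment(phi, n_copies, rng):
    """Return the original event plus n_copies independently rotated copies."""
    return [phi] + [phi_shift(phi, rng) for _ in range(n_copies)]

rng = np.random.default_rng(1)
# Hypothetical azimuthal angles of four constituents in a single event.
event_phi = np.array([-3.0, 0.2, 1.5, 3.1])
copies = augment(event_phi, n_copies=3, rng=rng)
print(f"{len(copies)} versions of the event after augmentation")
```

A common rotation leaves every pairwise azimuthal separation unchanged modulo $2\pi$, so each copy describes the same event topology and only its overall orientation differs. Transverse momenta and pseudorapidities would be left untouched under this scheme.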

Paper Structure

This paper contains 21 sections, 3 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Distribution of jet flavor compositions at $\mathcal{L} = 3000~\mathrm{fb}^{-1}$ for $H\to\gamma\gamma$ and $H\to ZZ\to 4\ell$. The categories 2q0g, 1q1g, and 0q2g correspond to events with two quark-initiated jets, one quark and one gluon jet, and two gluon-initiated jets, respectively. The SR is defined by the 2q0g category, while the CR includes both 1q1g and 0q2g events.
  • Figure 2: The GGF processes with $H\to\gamma\gamma$ in the image-based representation. Each event image consists of three channels corresponding to tower, track, and decay-product information. Note that tower and track channels may also contain constituents from the Higgs decay products.
  • Figure 3: Test AUC of CNN and ParT models trained on $H \to \gamma\gamma$ events. Results are shown (a) with and (b) without decay-product information. Fully supervised (FS) baselines are included for reference. Each point represents the mean AUC over ten training seeds, and the shaded bands indicate one standard deviation.
  • Figure 4: Effect of $\phi$-shifting augmentation on $H \to \gamma\gamma$ classification performance. Each point represents the mean AUC over ten training seeds, with shaded bands indicating one standard deviation.
  • Figure 5: Test AUC of CNN and ParT models trained on $H \to ZZ \to 4\ell$ events, evaluated (a) with and (b) without decay-product information. Here "FS" stands for fully supervised learning. Each point represents the mean AUC over ten training seeds, and the shaded bands indicate one standard deviation.
  • ...and 2 more figures