Multivariate discrimination and the Higgs + W/Z search

Kevin Black, Jason Gallicchio, John Huth, Michael Kagan, Matthew D. Schwartz, Brock Tweedie

TL;DR

The paper develops a rigorous, systematic framework for optimizing multivariate discriminants in Higgs searches, introducing the Significance Improvement Characteristic (SIC) as a practical visualization and ranking tool. Using Boosted Decision Trees and a broad set of discriminants, including novel jet-substructure observables like twist, azilicity, and pull, the authors demonstrate potential 10-20% gains in significance for VH (H→bb) searches at the Tevatron and LHC against irreducible Z/W+bb backgrounds. A key finding is that discrimination improves with a carefully chosen set of 8–9 variables, and that jet reconstruction and multiple bb-mass measures can further aid performance, especially when combined with radiation observables. The work provides a general, generator-aware framework for multivariate optimization that can inform future detector-level studies and refine search strategies for the light Higgs in VH channels at hadron colliders.

Abstract

A systematic method for optimizing multivariate discriminants is developed and applied to the important example of a light Higgs boson search at the Tevatron and the LHC. The Significance Improvement Characteristic (SIC), defined as the signal efficiency of a cut or multivariate discriminant divided by the square root of the background efficiency, is shown to be an extremely powerful visualization tool. SIC curves demonstrate numerical instabilities in the multivariate discriminants, show convergence as the number of variables is increased, and display the sensitivity to the optimal cut values. For our application, we concentrate on Higgs boson production in association with a W or Z boson with H -> bb and compare to the irreducible standard model background, Z/W + bb. We explore thousands of experimentally motivated, physically motivated, and unmotivated single-variable discriminants. Along with the standard kinematic variables, a number of new ones, such as twist, are described which should have applicability to many processes. We find that some single variables, such as the pull angle, are weak discriminants, but when combined with others they provide important marginal improvement. We also find that multiple Higgs boson-candidate mass measures, such as from mildly and aggressively trimmed jets, when combined may provide additional discriminating power. Comparing the significance improvement from our variables to those used in recent CDF and DZero searches, we find that a 10-20% improvement in significance against Z/W + bb is possible. Our analysis also suggests that the H + W/Z channel with H -> bb is viable at the LHC, without requiring a hard cut on the W/Z transverse momentum.
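The SIC definition in the abstract can be made concrete with a short sketch. If a cut keeps a fraction $\epsilon_S$ of the signal and $\epsilon_B$ of the background, then $S/\sqrt{B}$ becomes $(\epsilon_S/\sqrt{\epsilon_B})\,S/\sqrt{B}$, so the ratio $\epsilon_S/\sqrt{\epsilon_B}$ is the factor by which the cut changes the significance. The Python below scans a threshold over a one-dimensional discriminant score and evaluates that ratio at each cut. The function name `sic_curve`, the "score > threshold" cut convention, and the Gaussian toy samples are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def sic_curve(signal_scores, background_scores, thresholds):
    """Return (eps_S, eps_B, SIC) for cuts of the form score > threshold.

    Hypothetical helper: SIC = eps_S / sqrt(eps_B), following the
    definition quoted in the abstract.
    """
    signal_scores = np.asarray(signal_scores, dtype=float)
    background_scores = np.asarray(background_scores, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)

    # Efficiency = fraction of events that pass each cut.
    eps_s = (signal_scores[None, :] > thresholds[:, None]).mean(axis=1)
    eps_b = (background_scores[None, :] > thresholds[:, None]).mean(axis=1)

    # SIC is undefined where no background event survives the cut,
    # so those entries are left as NaN instead of dividing by zero.
    sic = np.full_like(eps_s, np.nan)
    ok = eps_b > 0
    sic[ok] = eps_s[ok] / np.sqrt(eps_b[ok])
    return eps_s, eps_b, sic


# Toy inputs (assumed for illustration only): two unit-width Gaussians.
rng = np.random.default_rng(0)
sig = rng.normal(1.0, 1.0, 10_000)
bkg = rng.normal(0.0, 1.0, 10_000)
cuts = np.linspace(-3.0, 3.0, 61)

eps_s, eps_b, sic = sic_curve(sig, bkg, cuts)
best = np.nanargmax(sic)
print(f"best cut {cuts[best]:.2f}: eps_S={eps_s[best]:.3f}, "
      f"eps_B={eps_b[best]:.3f}, SIC={sic[best]:.2f}")
```

Plotting `sic` against `eps_s` gives a curve of the kind the abstract describes. With finite samples the high-threshold tail is computed from very few background events, so the curve becomes noisy there, which is one reason a full curve is more informative than a single best value.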

Paper Structure

This paper contains 19 sections, 18 equations, 29 figures, 5 tables.

Figures (29)

  • Figure 1: Some lab-frame kinematic variables for $ZH$ signal (solid blue) and $Z b\bar{b}$ background (hashed red) at the LHC. Events satisfy selection cuts and the Higgs mass window cut, $90\,\mathrm{GeV} < {m_{b\bar{b}}} < 124\,\mathrm{GeV}$. Horizontal axes are in radians or GeV as appropriate, and vertical axes are in arbitrary units with signal and background normalized to the same area.
  • Figure 2: $\Delta \eta_{b \bar{b}}$ vs $\Delta \phi_{b \bar{b}}$ for the Higgs boson signal (left) and the $gg$-initiated $Z b\bar{b}$ background dominant at the LHC (right). This is at the hard parton level, and for $p_T^H>50$ GeV. The difference is less dramatic for lower $p_T$ or for the Tevatron, where the $q \bar{q}$-initiated background dominates. Absolute values could also be taken for $b$ indistinguishable from $\bar{b}$.
  • Figure 3: Twist angle $\tau$ in 3D with the $b$ and $\bar{b}$ emerging from the interaction point. The twist angle is defined to be boost invariant and does not exactly correspond to the physical rotation angle of a plane. The case shown, however, has no longitudinal boost.
  • Figure 4: Twist angle distributions for $ZH$ signal (solid blue) and $Z {b\bar{b}}$ background (hashed red), for the LHC with no $p_T^Z$ cut. MadGraph hard parton-level with no cuts (top) and showered jet-level with detector cuts (bottom). Both are shown only in the Higgs mass window, $90\,\mathrm{GeV} < {m_{b\bar{b}}} < 124\,\mathrm{GeV}$. Vertical axes are in arbitrary units with signal and background normalized to the same area.
  • Figure 5: Helicity angle $\theta_h$ and azilicity angle $\phi_{a}$ for $H \rightarrow b \bar{b}$. Angles can also be defined for the leptons on the $Z \rightarrow \ell^+ \ell^-$ side of the event.
  • ...and 24 more figures