W-jet Tagging: Optimizing the Identification of Boosted Hadronically-Decaying W Bosons

Yanou Cui, Zhenyu Han, Matthew D. Schwartz

TL;DR

Boosted, hadronically decaying W jets can be distinguished from QCD jets by exploiting jet substructure. The authors introduce new discriminants (R-cores, planar flow, grooming-sensitivities) and combine them with grooming using Boosted Decision Trees. They report an improvement in significance (S/√B) by a factor of ~5, compared with a factor of ~2 from filtering/mass-drop alone, especially at high pT, and illustrate applications to Z′ searches and hadronic W+jet analyses at the LHC. The work also analyzes W-polarization effects, cross-checks the results with different Monte Carlo tools, and provides public code for W-jet tagging. Overall, the approach offers a powerful, generalizable framework for identifying boosted hadronic W decays in high-energy collisions.

Abstract

A method is proposed for distinguishing highly boosted hadronically decaying W's (W-jets) from QCD-jets using jet substructure. Previous methods, such as the filtering/mass-drop method, can give a factor of ~2 improvement in S/sqrt(B) for jet pT > 200 GeV. In contrast, a multivariate approach including new discriminants such as R-cores, which characterize the shape of the W-jet, subjet planar flow, and grooming-sensitivities, is shown to provide a much larger factor of ~5 improvement in S/sqrt(B). For longitudinally polarized W's, such as those coming from many new physics models, the discrimination is even better. Comparing different Monte Carlo simulations, we observe a sensitivity of some variables to the underlying event; however, even with conservative estimates, the multivariate approach is very powerful. Applications to semileptonic WW resonance searches and all-hadronic W+jet searches at the LHC are also discussed. Code implementing our W-jet tagging algorithm is publicly available at http://jets.physics.harvard.edu/wtag
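
As a short worked step, the "factor of improvement in S/sqrt(B)" quoted above can be related to selection efficiencies. Suppose a tagger keeps a fraction $\varepsilon_S$ of the signal and $\varepsilon_B$ of the background; the primed quantities $S'$ and $B'$ (event counts after the selection) are notation introduced here for illustration only:

$$
\frac{S'}{\sqrt{B'}} \;=\; \frac{\varepsilon_S\, S}{\sqrt{\varepsilon_B\, B}} \;=\; \frac{\varepsilon_S}{\sqrt{\varepsilon_B}} \cdot \frac{S}{\sqrt{B}}
$$

The improvement factor is therefore $\varepsilon_S/\sqrt{\varepsilon_B}$. With purely illustrative numbers (not taken from the paper), $\varepsilon_S = 0.5$ and $\varepsilon_B = 0.01$ would give $0.5/\sqrt{0.01} = 5$.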

Paper Structure

This paper contains 17 sections, 8 equations, 26 figures, 3 tables.

Figures (26)

  • Figure 1: Significance Improvement Characteristics ($\varepsilon_S/\sqrt{\varepsilon_B}$) for leptonic-$W$+$W$-jet events (signal) versus their leptonic-$W$+QCD-jet background, for $p_T^{\text{jet}}\in(500, 550)~{\hbox{GeV}}$. The bottom two curves show the effect of an optimized simple mass window for $R=1.2$ and $R=0.4$ Cambridge/Aachen jets. The falloff of the $R=0.4$ efficiencies is due to events in which the $W$-subjets are well separated. The next curve up shows the efficiency of the filtering/mass-drop method, optimized over the filtering parameters. The top curve is the result of our multivariate analysis, which includes many variables on top of the filtered result. The starting point for the multivariate analysis is a filtered sample with a mass window slightly wider than the one that is optimal for filtering, as indicated by the star.
  • Figure 2: Jet masses before and after filtering/mass-drop for $p_T^{\rm jet}\in(500, 550)~{\hbox{GeV}}$. The numbers of events are normalized to be the same for the signal and the background. (a) Before filtering; (b) after filtering with $\mu =0.71$ and $y_{\rm cut}=0.09$. When a mass-drop is not found, we add an entry in the zero mass bin such that the total number of jets is unchanged.
  • Figure 3: The significance improvement characteristic (SIC$\equiv\varepsilon_S/\sqrt{\varepsilon_B}$) as a function of the filtering parameters, $\mu$ and $y_{\rm cut}$, for $p_T^{\rm jet}\in(500, 550)~{\hbox{GeV}}$.
  • Figure 4: Tuning of filtering parameters for $W$-jets versus QCD-jets in the standard model.
  • Figure 5: Distributions for the fat-jet mass and hardest subjet mass for signal ($W$-jets) and background (QCD-jets) with $p_T^{\rm jet}\in(500,550)$ GeV. The edge at 60 GeV in the jet mass plot follows from a preselection cut on the filtered mass, $m_{\text{filt}} \in (60,100)~{\hbox{GeV}}$.
  • ...and 21 more figures
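
The captions above quote the significance improvement characteristic, SIC $\equiv\varepsilon_S/\sqrt{\varepsilon_B}$, for mass-window selections. The following is a minimal sketch of how such a number could be computed from two lists of jet masses. It is not the authors' code: the function names `sic` and `mass_window_sic` are hypothetical, the toy masses are invented for illustration, and the $(60,100)$ GeV window is simply borrowed from the preselection quoted in the Figure 5 caption.

```python
import math


def sic(n_sig_pass, n_sig_total, n_bkg_pass, n_bkg_total):
    """Significance improvement characteristic: eps_S / sqrt(eps_B)."""
    eps_s = n_sig_pass / n_sig_total
    eps_b = n_bkg_pass / n_bkg_total
    if eps_b == 0:
        # With no surviving background the ratio is undefined.
        raise ValueError("no background events pass; SIC is undefined")
    return eps_s / math.sqrt(eps_b)


def mass_window_sic(sig_masses, bkg_masses, lo, hi):
    """SIC of the cut lo < m_jet < hi (masses in GeV)."""
    n_s = sum(lo < m < hi for m in sig_masses)
    n_b = sum(lo < m < hi for m in bkg_masses)
    return sic(n_s, len(sig_masses), n_b, len(bkg_masses))


# Toy jet masses in GeV, invented for illustration (not from the paper).
sig = [78.0, 81.5, 80.2, 84.0, 40.0, 79.1, 120.0, 82.3]
bkg = [15.0, 30.0, 55.0, 70.0, 85.0, 110.0, 150.0, 200.0,
       25.0, 45.0, 95.0, 130.0, 65.0, 20.0, 35.0, 175.0]

# 6 of 8 signal jets and 4 of 16 background jets fall in the window,
# so eps_S = 0.75, eps_B = 0.25 and SIC = 0.75 / 0.5 = 1.5.
print(mass_window_sic(sig, bkg, 60.0, 100.0))
```

Scanning `lo` and `hi` over a grid and keeping the largest value would mimic the "optimized simple mass window" curves, under the assumption that the optimization is a plain scan over window edges.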