Jet Flavor Classification in High-Energy Physics with Deep Neural Networks

Daniel Guest, Julian Collado, Pierre Baldi, Shih-Chieh Hsu, Gregor Urban, Daniel Whiteson

TL;DR

This work tackles jet flavor tagging in high-energy physics, where distinguishing heavy-flavor jets from light-flavor jets relies on high-dimensional tracking data. The authors compare three deep-learning architectures (feedforward, LSTM, and outer recursive networks) applied to data at three processing levels (tracks, vertices, and expert features), using GPU-accelerated training and a fixed-structure dataset to assess information loss across levels. They demonstrate that lower-level track and vertex information can outperform or augment traditional expert-feature baselines, and that combining low-level data with expert features yields the best overall performance, approaching or surpassing state-of-the-art taggers in several regimes. These findings guide the design of jet flavor tagging algorithms for experiments and highlight the importance of preserving information in high-dimensional detector data. The goal is a tagger score that approximates the true likelihood ratio $\frac{P(\bar{x}|b)}{P(\bar{x}|q)}$ as closely as feasible.
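To make the link between a classifier score and the likelihood ratio concrete, here is a minimal sketch. It assumes the network output can be read as a posterior probability $s = P(b|\bar{x})$ and that the heavy-flavor fraction of the training sample (`prior_b`) is known; under those assumptions Bayes' rule gives $\frac{P(\bar{x}|b)}{P(\bar{x}|q)} = \frac{s}{1-s}\cdot\frac{1-\text{prior}_b}{\text{prior}_b}$. The function name and parameters are illustrative and do not come from the paper.

```python
import numpy as np

def likelihood_ratio_from_score(score, prior_b=0.5, eps=1e-12):
    """Convert a posterior-like score s = P(b|x) into P(x|b) / P(x|q).

    Assumptions (not from the paper): the score is a calibrated posterior,
    and prior_b is the fraction of heavy-flavor jets in the training sample.
    """
    # Clip to avoid division by zero at s = 0 or s = 1.
    s = np.clip(np.asarray(score, dtype=float), eps, 1.0 - eps)
    posterior_odds = s / (1.0 - s)
    prior_odds = prior_b / (1.0 - prior_b)
    # Dividing out the prior odds leaves the likelihood ratio.
    return posterior_odds / prior_odds

scores = np.array([0.5, 0.8, 0.2])
ratios = likelihood_ratio_from_score(scores)
```

Because the mapping is monotonic, cutting on the score and cutting on the likelihood ratio select the same jets; the conversion only matters when the ratio itself is needed.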

Abstract

Classification of jets as originating from light-flavor or heavy-flavor quarks is an important task for inferring the nature of particles produced in high-energy collisions. The large and variable dimensionality of the data provided by the tracking detectors makes this task difficult. The current state-of-the-art tools require expert data-reduction to convert the data into a fixed low-dimensional form that can be effectively managed by shallow classifiers. We study the application of deep networks to this task, attempting classification at several levels of data, starting from a raw list of tracks. We find that the highest-level lowest-dimensionality expert information sacrifices information needed for classification, that the performance of current state-of-the-art taggers can be matched or slightly exceeded by deep-network-based taggers using only track and vertex information, that classification using only lowest-level highest-dimensionality tracking information remains a difficult task for deep networks, and that adding lower-level track and vertex information to the classifiers provides a significant boost in performance compared to the state-of-the-art.
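The variable dimensionality mentioned above is commonly handled by truncating and padding each jet's track list to a fixed shape before feeding it to a network. The sketch below illustrates that idea only. The cap of 15 tracks is taken from the Figure 2 caption, while the number of features per track, the zero padding value, and the choice to keep the first rows as given are assumptions made for illustration.

```python
import numpy as np

MAX_TRACKS = 15   # maximum allowed value quoted in the Figure 2 caption
N_FEATURES = 4    # hypothetical number of features per track

def pad_tracks(tracks, max_tracks=MAX_TRACKS, pad_value=0.0):
    """Return a (max_tracks, N_FEATURES) array and a boolean validity mask.

    Tracks beyond max_tracks are dropped; missing tracks are filled with
    pad_value. The input order is kept as given (an assumption).
    """
    tracks = np.asarray(tracks, dtype=float).reshape(-1, N_FEATURES)
    kept = tracks[:max_tracks]
    padded = np.full((max_tracks, N_FEATURES), pad_value)
    padded[:len(kept)] = kept
    # The mask marks which rows hold real tracks rather than padding.
    mask = np.zeros(max_tracks, dtype=bool)
    mask[:len(kept)] = True
    return padded, mask

short_jet = np.ones((3, N_FEATURES))
long_jet = np.arange(20 * N_FEATURES).reshape(20, N_FEATURES)
padded_short, mask_short = pad_tracks(short_jet)
padded_long, mask_long = pad_tracks(long_jet)
```

Returning the mask alongside the padded array lets a downstream model tell real tracks from filler, which matters when a padding value could coincide with a physical one.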

Paper Structure

This paper contains 12 sections, 11 figures, 1 table.

Figures (11)

  • Figure 1: Distributions in simulated samples of high-level jet flavor variables widely used to discriminate between jets from light-flavor and heavy-flavor quarks.
  • Figure 2: Top: Distribution of the number of tracks associated to a jet in simulated samples. Bottom: Distribution of the number of vertices associated to a jet in simulated samples, before and after removing tracks which exceed the maximum allowed value of 15.
  • Figure 3: Feedforward neural network architecture. In the first layer, connections of the same color represent the same value of the shared weight. The other layers are fully connected without shared weights.
  • Figure 4: Architecture of the Long Short Term Memory networks as described in the text.
  • Figure 5: Architecture of the outer recursive networks as described in the text.
  • ...and 6 more figures