Jet Flavor Classification in High-Energy Physics with Deep Neural Networks
Daniel Guest, Julian Collado, Pierre Baldi, Shih-Chieh Hsu, Gregor Urban, Daniel Whiteson
TL;DR
This work addresses jet flavor tagging in high-energy physics, where distinguishing heavy-flavor jets from light-flavor jets relies on high-dimensional tracking data. The authors compare three deep-learning architectures (feedforward, LSTM, and outer recursive networks) applied to data at three processing levels (tracks, vertices, and expert features), using GPU-accelerated training and a fixed-structure dataset to assess information loss across levels. They show that networks using only track and vertex information can match or slightly exceed traditional expert-feature baselines, that classification from raw tracks alone remains difficult, and that combining low-level data with expert features yields the best overall performance. These findings guide the design of jet flavor tagging algorithms for experiments and highlight the importance of preserving information in high-dimensional detector data. The objective is a tagger score that approximates the true likelihood ratio $\frac{P(\bar{x}|b)}{P(\bar{x}|q)}$ as closely as feasible.
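As a brief illustration of why a classifier score can stand in for the likelihood ratio, consider an idealized network output $s(\bar{x})$ that equals the posterior probability $P(b|\bar{x})$, as would hold for a perfectly trained classifier under a cross-entropy loss. The symbols $s$, $\pi_b$, and $\pi_q$ (the fractions of heavy-flavor and light-flavor jets in the training sample) are notation assumed here for the sketch and are not taken from the text above. Bayes' theorem then gives

$$
s(\bar{x}) = P(b|\bar{x}) = \frac{\pi_b \, P(\bar{x}|b)}{\pi_b \, P(\bar{x}|b) + \pi_q \, P(\bar{x}|q)}
\quad\Longrightarrow\quad
\frac{P(\bar{x}|b)}{P(\bar{x}|q)} = \frac{\pi_q}{\pi_b} \, \frac{s(\bar{x})}{1 - s(\bar{x})}
$$

Under these assumptions the likelihood ratio is a monotonically increasing function of $s(\bar{x})$, so placing a threshold on the network score selects the same jets as placing a threshold on the likelihood ratio, which is the optimal test by the Neyman-Pearson lemma. A real network only approximates this posterior, so the relation holds approximately.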
Abstract
Classification of jets as originating from light-flavor or heavy-flavor quarks is an important task for inferring the nature of particles produced in high-energy collisions. The large and variable dimensionality of the data provided by the tracking detectors makes this task difficult. The current state-of-the-art tools require expert data reduction to convert the data into a fixed low-dimensional form that can be effectively managed by shallow classifiers. We study the application of deep networks to this task, attempting classification at several levels of data, starting from a raw list of tracks. We find that the highest-level, lowest-dimensionality expert information sacrifices information needed for classification; that the performance of current state-of-the-art taggers can be matched or slightly exceeded by deep-network-based taggers using only track and vertex information; that classification using only the lowest-level, highest-dimensionality tracking information remains a difficult task for deep networks; and that adding lower-level track and vertex information to the classifiers provides a significant boost in performance compared to the state-of-the-art.
