Flow Matching Beyond Kinematics: Generating Jets with Particle-ID and Trajectory Displacement Information

Joschka Birk, Erik Buhmann, Cedric Ewen, Gregor Kasieczka, David Shih

TL;DR

This work develops a permutation-equivariant continuous normalizing flow (CNF) that generates jets at the constituent level, trained via flow matching on the large JetClass dataset. Conditioning on jet type lets a single model cover all ten jet classes, and the model goes beyond kinematics by also generating particle IDs and trajectory displacement, using EPiC layers to process jets as unordered point clouds. Generated jets agree closely with real jets in constituent kinematics and jet substructure, as measured by KL divergences, Fréchet distances, and a classifier test with AUCs up to 0.829, although complex 3-prong topologies remain challenging. The work extends jet generative modeling beyond kinematics, enabling richer benchmarks and potential improvements in anomaly detection and differentiable analyses, and the code is released for reproducibility.

Abstract

We introduce the first generative model trained on the JetClass dataset. Our model generates jets at the constituent level, and it is a permutation-equivariant continuous normalizing flow (CNF) trained with the flow matching technique. It is conditioned on the jet type, so that a single model can be used to generate the ten different jet types of JetClass. For the first time, we also introduce a generative model that goes beyond the kinematic features of jet constituents. The JetClass dataset includes more features, such as particle-ID and track impact parameter, and we demonstrate that our CNF can accurately model all of these additional features as well. Our generative model for JetClass expands on the versatility of existing jet generation techniques, enhancing their potential utility in high-energy physics research, and offering a more comprehensive understanding of the generated jets.
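
As a rough illustration of the flow matching technique named in the abstract, the sketch below builds the regression target for a conditional flow matching loss in NumPy. It assumes a straight-line probability path from Gaussian noise to data with a small `sigma_min`, which is one common choice and not necessarily the exact path used in the paper. The function names, array shapes, and padding mask are likewise illustrative assumptions.

```python
import numpy as np


def flow_matching_batch(x1, rng, sigma_min=1e-4):
    """Build one conditional flow matching training batch.

    x1: data jets, shape (n_jets, n_constituents, n_features).
    Returns the sampled times t, the interpolated points xt, and the
    target velocity that the network should regress onto.
    Assumption: straight-line path with a small sigma_min.
    """
    x0 = rng.standard_normal(x1.shape)          # noise sample per constituent
    t = rng.uniform(size=(x1.shape[0], 1, 1))   # one time value per jet
    xt = (1.0 - (1.0 - sigma_min) * t) * x0 + t * x1
    target = x1 - (1.0 - sigma_min) * x0        # d(xt)/dt along the path
    return t, xt, target


def flow_matching_loss(v_pred, target, mask):
    """Mean squared error between predicted and target velocities.

    mask: shape (n_jets, n_constituents), 1 for real constituents and
    0 for zero-padded ones, so padding does not contribute to the loss.
    """
    sq_err = ((v_pred - target) ** 2).sum(axis=-1)
    return (sq_err * mask).sum() / mask.sum()
```

In a training loop, `v_pred` would come from the permutation-equivariant network evaluated at `(t, xt)` together with the jet-type condition. Sampling then integrates the learned velocity field from noise at `t = 0` to generated constituents at `t = 1`.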

Paper Structure

This paper contains 11 sections, 1 equation, 10 figures, 7 tables.

Figures (10)

  • Figure 1: Schematic overview of the different jet constituent features available in the JetClass dataset. The horizontal line at the bottom represents the beam axis and the circle on this line represents the primary vertex (PV).
  • Figure 2: Overview of some of the features from the JetClass dataset: (a) the number of jet constituents, (b) the significance of the transverse impact parameter $d_0$, (c) the fraction of the jet $p_\text{T}$ carried by the jet constituent, and (d) the difference $\eta^{\text{rel}}$ between the constituent pseudorapidity and the pseudorapidity of the jet axis. The impact parameter significance $d_0 / \sigma_{d_0}$ is shown only for charged particles, since the impact parameter is 0 for neutral particles. The number of jet constituents is a jet-level feature that our model is conditioned on, while the remaining features are constituent-level features that our model generates.
  • Figure 3: Pair plot of the jet features of the JetClass dataset and the KDE samples for the $t\to bqq'$ and $q/g$ jet types. The features generated by the KDE are used as conditioning features for the final generative model. The diagonal shows the distribution of the individual features, while the off-diagonal plots show the correlation between the features. The lines in the off-diagonal plots correspond to iso-proportions (20%, 40%, 60% and 80%) of the density (e.g. 20% of the data points are outside the 20% line). Each of the 4 plotted datasets contains 100k jets.
  • Figure 4: Two kinematic features and the trajectory displacement of the jet constituents as generated by our model (solid lines), compared with the distributions obtained from the JetClass dataset (semi-transparent histograms). The upper row shows (a) the relative transverse momentum and (b) the relative pseudorapidity. The lower row shows (c) the transverse impact parameter significance and (d) the longitudinal impact parameter significance. Only charged particles are shown in the trajectory displacement histograms, and the outermost bins contain the underflow and overflow.
  • Figure 5: Pair plot of the kinematic features of the jet constituents for the $t\to bqq'$ and $q/g$ jet types illustrating that the model correctly captures the correlations between the different kinematic features. The lines in the off-diagonal plots correspond to iso-proportions (20%, 40%, 60% and 80%) of the density (e.g. 20% of the data points are outside the 20% line). Each of the 4 plotted datasets contains 100k constituents.
  • ...and 5 more figures
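
The constituent-level features described in the Figure 2 caption can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function and argument names are hypothetical and are not the JetClass column names, and setting the significance to 0 for neutral particles mirrors the caption's statement that their impact parameter is 0.

```python
import numpy as np


def relative_features(pt, eta, d0, d0_err, charge, jet_pt, jet_eta):
    """Compute per-constituent features relative to the jet.

    All per-constituent inputs are 1D arrays of equal length;
    jet_pt and jet_eta are scalars for the jet they belong to.
    (Hypothetical helper, not part of the JetClass tooling.)
    """
    pt_rel = pt / jet_pt      # fraction of the jet pT carried by the constituent
    eta_rel = eta - jet_eta   # pseudorapidity relative to the jet axis

    # Impact parameter significance d0 / sigma_d0, defined only for
    # charged particles with a valid (positive) uncertainty.
    valid = (charge != 0) & (d0_err > 0)
    safe_err = np.where(valid, d0_err, 1.0)   # avoid division by zero
    d0_sig = np.where(valid, d0 / safe_err, 0.0)
    return pt_rel, eta_rel, d0_sig
```

For example, a constituent with $p_\text{T} = 50$ in a jet with $p_\text{T} = 100$ (same units) has a relative transverse momentum of 0.5, and a charged constituent with $d_0 = 0.02$ and $\sigma_{d_0} = 0.01$ has a significance of 2.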