TrackFormers: In Search of Transformer-Based Particle Tracking for the High-Luminosity LHC Era

Sascha Caron, Nadezhda Dobreva, Antonio Ferrer Sánchez, José D. Martín-Guerrero, Uraz Odyurt, Roberto Ruiz de Austri Bazan, Zef Wolffs, Yue Zhao

TL;DR

TrackFormers investigates Transformer-based particle tracking for the HL-LHC era by evaluating three Transformer-based designs (EncDec, EncCla, EncReg) and a sparse U-Net on the REDVID and TrackML data sets. The study demonstrates that the single-shot encoder-classifier and encoder-regressor architectures offer strong predictive performance with favorable inference times, while the autoregressive EncDec is slower but provides complementary capabilities. Through a multi-dataset, multi-architecture evaluation, the authors show the viability of ML-first, single-pass tracking and discuss scaling, post-processing, and memory-optimized variants such as Flash Attention. The results highlight practical pathways toward deploying ML-based tracking in HL-LHC pipelines and outline concrete directions for speedups and accuracy enhancements, with data and code openly available for community use.

Abstract

High-Energy Physics experiments are facing a multi-fold data increase with every new iteration. This is certainly the case for the upcoming High-Luminosity LHC upgrade. Such increased data processing requirements force revisions to almost every step of the data processing pipeline. One such step in need of an overhaul is the task of particle track reconstruction, also known as tracking. A Machine Learning-assisted solution is expected to provide significant improvements, since the most time-consuming step in tracking is the assignment of hits to particles or track candidates. This is the topic of this paper. We take inspiration from large language models. As such, we consider two approaches: the prediction of the next word in a sentence (next hit point in a track), as well as the one-shot prediction of all hits within an event. In an extensive design effort, we have experimented with three models based on the Transformer architecture and one model based on the U-Net architecture, performing track association predictions for collision event hit points. In our evaluation, we consider a spectrum of simple to complex representations of the problem, eliminating designs with lower metrics early on. We report extensive results, covering both prediction accuracy (score) and computational performance. For our experiments, we have made use of the REDVID simulation framework, as well as reductions applied to the TrackML data set, to compose five data sets ranging from simple to complex. The results highlight distinct advantages among different designs in terms of prediction accuracy and computational performance, demonstrating the efficiency of our methodology. Most importantly, the results show the viability of a one-shot encoder-classifier based Transformer solution as a practical approach for the task of tracking.

Paper Structure

This paper contains 29 sections, 3 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: The fully parametric detector geometry, allowing for inclusion/exclusion of different sub-detector types, with full control over sub-layer counts, sizes and placements.
  • Figure 2: Intuitive visualisations of the inner workings of the three Transformer model designs, EncDec, EncCla and EncReg, respectively. We showcase, in simplified form, the hit processing applied by each pipeline. Hits are represented by dots, with gray dots representing no track associations, and dots with the same colour belonging to the same track.
  • Figure 3: High-level views of the dedicated workflows for each ML model design.
  • Figure 4: FitAccuracy score, as defined in Section 4, and the fake rate, which represents the fraction of predicted tracks that do not correspond to true tracks, as a function of physics observables for the EncCla and EncReg models on the TrackML 10-50 tracks data set, in black and red, respectively. (a) FitAccuracy as a function of the transverse momentum $p_T$, (b) FitAccuracy as a function of the pseudorapidity $\eta$, (c) fake rate as a function of the transverse momentum $p_T$, and (d) fake rate as a function of the pseudorapidity $\eta$. The uncertainties represent the 68% confidence intervals and were calculated via a bootstrapping procedure. The decreasing trend of EncCla in (d) is due to the increase of bin size with $\eta$, which affects both the number of correctly predicted tracks and the total number of predicted tracks.
  • Figure 5: Mean inference GPU-time for the EncReg-FA model trained on the 200–500 tracks per event data set, plotted as a function of the number of tracks per event.
  • ...and 2 more figures
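
The Figure 4 caption defines the fake rate as the fraction of predicted tracks that do not correspond to true tracks, and states that the 68% intervals come from a bootstrapping procedure. A minimal sketch of how such an interval could be obtained is given below. It assumes a percentile bootstrap that resamples predicted tracks with replacement; the paper summary does not say which bootstrap variant or resampling unit the authors used. The function names and the toy input are hypothetical and not taken from the authors' code.

```python
import random


def fake_rate(matched):
    """Fraction of predicted tracks that do not match a true track.

    `matched` holds one boolean per predicted track (True = matched).
    """
    if not matched:
        raise ValueError("need at least one predicted track")
    return sum(1 for m in matched if not m) / len(matched)


def bootstrap_interval(matched, n_resamples=2000, level=0.68, seed=0):
    """Percentile-bootstrap interval for the fake rate (assumed variant)."""
    rng = random.Random(seed)
    n = len(matched)
    # Recompute the fake rate on resamples drawn with replacement.
    estimates = sorted(
        fake_rate([matched[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * (n_resamples - 1))]
    upper = estimates[int((1.0 - tail) * (n_resamples - 1))]
    return lower, upper


# Hypothetical toy input: 200 predicted tracks, 30 of them fake.
predicted = [True] * 170 + [False] * 30
rate = fake_rate(predicted)
low, high = bootstrap_interval(predicted)
print(f"fake rate = {rate:.3f}, 68% interval = [{low:.3f}, {high:.3f}]")
```

For a binned plot such as Figure 4(c) or 4(d), the same procedure would be applied separately to the predicted tracks falling in each $p_T$ or $\eta$ bin.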