TrackFormers Part 2: Enhanced Transformer-Based Models for High-Energy Physics Track Reconstruction
Sascha Caron, Nadezhda Dobreva, Maarten Kimpel, Uraz Odyurt, Slav Pshenov, Roberto Ruiz de Austri Bazan, Eugene Shalugin, Zef Wolffs, Yue Zhao
TL;DR
The paper addresses the scalability of track reconstruction at the HL-LHC by extending TrackFormers with projection-based hit geometry, lightweight clustering, and a two-stage encoder-only transformer that jointly regresses track parameters and classifies hits. It introduces FlexAttention with BlockMask to handle variable-length sequences, and a contrastive learning framework that performs hit-to-track association in a single forward pass. It also provides a reproducible ACTS-based hit-level dataset for $pp \rightarrow t\bar{t}H$ with $H \rightarrow b\bar{b}$ and for $pp \rightarrow t\bar{t}$ backgrounds at pileup levels from $0$ to $200$, enabling realistic benchmarking. Reported results show end-to-end inference in tens of milliseconds, a track efficiency of about 90–91% in the barrel and endcaps, and up to a 400× reduction in attention cost. Deeper encoder-only models deliver consistent accuracy gains, which makes the approach appealing for HL-LHC deployment.
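To illustrate how a block mask restricts attention when sequences of different lengths are packed together, here is a minimal NumPy sketch. It is not the FlexAttention API and not the authors' implementation: the helper names `block_mask` and `masked_attention`, the sequence lengths, and the embedding size are all hypothetical. The sketch computes dense attention and only demonstrates the masking pattern, whereas a block-sparse kernel would skip the masked-out blocks entirely.

```python
import numpy as np

def block_mask(seq_ids):
    """Boolean mask: entry (q, k) is True when hits q and k share a sequence."""
    seq_ids = np.asarray(seq_ids)
    return seq_ids[:, None] == seq_ids[None, :]

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention restricted to allowed (q, k) pairs."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)      # forbid cross-sequence pairs
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Three sequences with 3, 2 and 4 hits packed into one row of 9 hits
# (illustrative sizes, not taken from the paper).
seq_ids = np.repeat([0, 1, 2], [3, 2, 4])
mask = block_mask(seq_ids)

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(9, 8)) for _ in range(3))
out = masked_attention(q, k, v, mask)

# Fraction of (query, key) pairs that remain after masking.
allowed_fraction = mask.mean()
```

Here only 29 of the 81 query-key pairs are allowed, and the fraction shrinks further as more and shorter sequences are packed together, which is the source of the attention-cost savings that a block-sparse implementation can exploit.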
Abstract
High-Energy Physics experiments are rapidly escalating in generated data volume, a trend that will intensify with the upcoming High-Luminosity LHC upgrade. This surge in data necessitates critical revisions across the data processing pipeline, with particle track reconstruction being a prime candidate for improvement. In our previous work, we introduced "TrackFormers", a collection of Transformer-based one-shot encoder-only models that effectively associate hits with expected tracks. In this study, we extend that work by incorporating loss functions that account for inter-hit correlations, investigating several Transformer attention mechanisms in detail, and studying the reconstruction of higher-level objects. Furthermore, we discuss new datasets that allow training at the hit level for a range of physics processes. Together, these developments aim to boost the accuracy, and potentially the efficiency, of our tracking models, offering a robust solution to meet the demands of next-generation high-energy physics experiments.
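One way a loss can account for inter-hit correlations is to compare every hit embedding against every other hit in the event. The sketch below uses a generic supervised contrastive loss as a stand-in. It is an assumption for illustration and not necessarily the loss used in the paper; the function name, the temperature value, and the toy embeddings are hypothetical.

```python
import numpy as np

def supervised_contrastive_loss(emb, track_ids, temperature=0.1):
    """Pull hits from the same track together and push other hits apart."""
    emb = np.asarray(emb, dtype=float)
    track_ids = np.asarray(track_ids)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # cosine similarity
    sim = emb @ emb.T / temperature

    not_self = ~np.eye(len(track_ids), dtype=bool)
    positives = (track_ids[:, None] == track_ids[None, :]) & not_self

    # Log-softmax of each anchor over all *other* hits.
    sim = np.where(not_self, sim, -np.inf)
    row_max = sim.max(axis=1, keepdims=True)
    log_norm = np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - row_max - log_norm

    # Average log-probability of the positives, for anchors that have any.
    has_pos = positives.any(axis=1)
    pos_sum = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_sum[has_pos] / positives.sum(axis=1)[has_pos]
    return per_anchor.mean()

# Four toy hit embeddings forming two tight pairs.
emb = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]]
loss_matched = supervised_contrastive_loss(emb, [0, 0, 1, 1])
loss_mismatched = supervised_contrastive_loss(emb, [0, 1, 0, 1])
```

With labels that match the geometry of the embeddings the loss is close to zero, while mismatched labels give a much larger value, so minimising it encourages hits from the same track to cluster in embedding space.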
