TrackSorter: A Transformer-based sorting algorithm for track finding in High Energy Physics
Yash Melkani, Xiangyang Ju
TL;DR
This work tackles track finding in High Energy Physics under HL-LHC conditions by reframing the problem as a sorting task. It introduces TrackSorter, a Transformer-based sequence-to-sequence model that converts space points into discrete tokens and is trained end to end to output track candidates. Evaluated on TrackML data with realistic pileup, the model demonstrates robust reconstruction efficiency across track lengths and kinematics, and the study analyzes inference behavior under greedy decoding. The results suggest potential for scaling with larger context windows and for future integration with larger language models to enhance pattern recognition in particle tracking.
Abstract
Track finding in particle data is a challenging pattern recognition problem in High Energy Physics. It takes a point cloud of space points as input and labels them so that space points created by the same particle share the same label. The list of space points with the same label is a track candidate. We argue that this pattern recognition problem can be formulated as a sorting problem, in which the input is a list of space points sorted by their distance from the collision points and the output is the same space points sorted by their labels. In this paper, we propose the TrackSorter algorithm: a Transformer-based algorithm for pattern recognition in particle data. TrackSorter uses a simple tokenization scheme to convert space points into discrete tokens. It then takes the tokenized space points as input and sorts the tokens into track candidates. TrackSorter is a novel end-to-end track finding algorithm that leverages Transformer-based models to solve pattern recognition problems. It is evaluated on the TrackML dataset and achieves good track finding performance.
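The sorting formulation can be illustrated with a minimal sketch that builds one input/target pair from labeled space points. Everything specific here is an assumption made for illustration and is not taken from the paper: the uniform binning tokenizer, the names tokenize and make_sorting_example, the parameters n_bins and extent, the use of the origin as a stand-in for the collision points, and the choice to keep distance order within each track.

```python
import math
from typing import List, Tuple

# Hypothetical space point: (x, y, z, particle_label).
SpacePoint = Tuple[float, float, float, int]


def tokenize(point: SpacePoint, n_bins: int = 10, extent: float = 50.0) -> int:
    """Map a space point to one discrete token by uniform binning.

    Assumed scheme for illustration only: each coordinate in
    [-extent, extent] is split into n_bins bins, and the three bin
    indices are combined into a single integer.
    """
    def bin_index(v: float) -> int:
        i = int((v + extent) * n_bins / (2 * extent))
        return min(max(i, 0), n_bins - 1)  # clamp points on or past the edge

    x, y, z, _ = point
    return (bin_index(x) * n_bins + bin_index(y)) * n_bins + bin_index(z)


def make_sorting_example(points: List[SpacePoint]) -> Tuple[List[int], List[int]]:
    """Build (input_tokens, target_tokens) for the sorting task."""
    # Input: space points ordered by distance from the origin
    # (the origin stands in for the collision points here).
    by_distance = sorted(points, key=lambda p: math.hypot(p[0], p[1], p[2]))
    # Target: the same points grouped by label. Python's sort is stable,
    # so points within one track stay in distance order (an assumption).
    by_label = sorted(by_distance, key=lambda p: p[3])
    return [tokenize(p) for p in by_distance], [tokenize(p) for p in by_label]


# Two toy particles (labels 0 and 1) with two space points each.
points = [(30.0, 0.0, 0.0, 1), (10.0, 0.0, 0.0, 0),
          (0.0, 20.0, 0.0, 1), (0.0, 40.0, 0.0, 0)]
inputs, targets = make_sorting_example(points)
print(inputs)   # [655, 575, 855, 595]
print(targets)  # [655, 595, 575, 855]
```

In this toy case the target is a permutation of the input in which the tokens of each particle are contiguous, which is what a sequence-to-sequence model would be trained to emit. Reading off consecutive runs of the output then yields the track candidates.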
