TrackSorter: A Transformer-based sorting algorithm for track finding in High Energy Physics

Yash Melkani, Xiangyang Ju

TL;DR

This work tackles track finding in High Energy Physics under HL-LHC conditions by reframing the problem as a sorting task. It introduces TrackSorter, a Transformer-based sequence-to-sequence model that tokenizes space points into discrete tokens and outputs track candidates via end-to-end learning. Using TrackML data with realistic pileup, the model demonstrates robust reconstruction efficiency across track lengths and kinematics, and analyzes inference behavior with greedy decoding. The study suggests potential for scaling with larger context windows and future integration with larger language models to enhance pattern recognition in particle tracking.

Abstract

Track finding in particle data is a challenging pattern recognition problem in High Energy Physics. It takes a point cloud of space points as input and labels them so that space points created by the same particle share the same label. The list of space points with the same label is a track candidate. We argue that this pattern recognition problem can be formulated as a sorting problem, in which the input is a list of space points sorted by their distance from the collision point and the output is the same space points sorted by their labels. In this paper, we propose the TrackSorter algorithm: a Transformer-based algorithm for pattern recognition in particle data. TrackSorter uses a simple tokenization scheme to convert space points into discrete tokens. It then takes the tokenized space points as input and sorts them into track candidates. TrackSorter is a novel end-to-end track finding algorithm that leverages Transformer-based models to solve pattern recognition problems. It is evaluated on the TrackML dataset and has good track finding performance.
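The sorting formulation can be illustrated with a minimal sketch of how one input/target pair might be built. This is not the authors' code. The tuple layout `(token_id, x, y, label)`, the function name `build_sorting_example`, and the choice to emit tracks in order of their innermost hit are all assumptions made for illustration; the tokenization itself is taken as given, and only the `[SEP]` end-of-track marker and the ordering by transverse distance $r$ come from the paper.

```python
import math
from collections import defaultdict

SEP = "[SEP]"  # end-of-track marker, as described in the paper's Figure 1


def build_sorting_example(space_points):
    """Build one (source, target) pair for the sorting formulation.

    space_points: list of (token_id, x, y, label) tuples. This layout is
    hypothetical; token_id is assumed to come from a tokenizer run upstream.
    """
    # Source: tokens ordered by transverse distance r from the collision point.
    by_r = sorted(space_points, key=lambda p: math.hypot(p[1], p[2]))
    source = [p[0] for p in by_r]

    # Target: the same tokens grouped by particle label, keeping the
    # r-ordering inside each track.
    tracks = defaultdict(list)
    for token_id, _x, _y, label in by_r:
        tracks[label].append(token_id)

    # Assumption: tracks are emitted in order of their innermost hit
    # (dicts preserve insertion order), each closed by SEP.
    target = []
    for hits in tracks.values():
        target.extend(hits)
        target.append(SEP)
    return source, target


# Two toy tracks, "A" and "B", with two hits each.
points = [
    (11, 3.0, 0.0, "A"),
    (12, 1.0, 0.0, "B"),
    (13, 0.0, 2.0, "A"),
    (14, 0.0, 4.0, "B"),
]
source, target = build_sorting_example(points)
print(source)  # [12, 13, 11, 14]
print(target)  # [12, 14, '[SEP]', 13, 11, '[SEP]']
```

In this picture, a sequence-to-sequence model would be trained to map `source` to `target`, and the track candidates are read off by splitting the predicted sequence at each `[SEP]`.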

Paper Structure (5 sections, 3 figures)

Figures (3)

  • Figure 1: Illustration of the TrackSorter algorithm. Each box represents a space point, with the token ID inside. [SEP] is a special token indicating the end of a track. $r$ is the distance between the space point and the collision point in the transverse plane.
  • Figure 2: The detector schematic shows the top half of the detector projected on the $r$-$z$ plane. The $z$-axis is along the beam direction.
  • Figure 3: Top row: distribution of track length (left) and track $p_\text{T}$ (right) in the test dataset. Bottom row: tracking efficiency as a function of track length (left) and particle transverse momentum (right).