Transformers for Charged Particle Track Reconstruction in High Energy Physics

Samuel Van Stroud, Philippa Duckett, Max Hart, Nikita Pond, Sébastien Rettie, Gabriel Facini, Tim Scanlon

TL;DR

The paper addresses the HL-LHC track-reconstruction bottleneck with a two-stage Transformer-based pipeline: a hit filtering stage that uses windowed attention to sharply reduce hit multiplicity, and a MaskFormer-inspired reconstruction stage that jointly assigns hits to tracks and regresses track parameters. Evaluated on TrackML, the best model reaches a state-of-the-art $\approx$97% tracking efficiency at a fake rate of about $0.6\%$, with inference times near $100\,$ms per event on a standard GPU. The two main innovations are phi-local windowed self-attention, which lets hit filtering scale linearly with the number of hits, and an end-to-end learnable MaskFormer-style tracker; together they allow deployment from trigger level to offline reconstruction. While the TrackML results are strong, the paper notes the need for validation under realistic detector conditions, integration with existing frameworks, and possible extensions to strip-layer data and uncertainty estimates. The findings point toward unified, scalable learned reconstruction methods that can adapt to diverse detectors and operational requirements.
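As a rough illustration of the phi-local windowing idea, the sketch below builds an attention mask in which each hit may attend only to its nearest neighbours in azimuthal order, with wrap-around at $\pm\pi$. The function name `phi_window_mask` and the parameter `half_width` are hypothetical, and the exact windowing scheme used in the paper may differ. The dense mask is built here only for readability; a linear-scaling implementation would compute attention inside the band without materialising the full matrix.

```python
import numpy as np

def phi_window_mask(phi, half_width):
    """Return an (M, M) boolean mask; True where hit i may attend to hit j.

    Hypothetical helper: hits are ranked by azimuthal angle phi and each hit
    sees the `half_width` neighbours on either side, cyclically.
    """
    m = len(phi)
    order = np.argsort(phi)                 # hit indices sorted by phi
    rank = np.empty_like(order)
    rank[order] = np.arange(m)              # position of each hit in phi order
    diff = np.abs(rank[:, None] - rank[None, :])
    diff = np.minimum(diff, m - diff)       # cyclic distance (phi wraps around)
    return diff <= half_width

# Five hits; with half_width=1 each hit sees itself and two neighbours.
phi = np.array([0.1, 3.0, -3.0, 1.5, -1.5])
mask = phi_window_mask(phi, half_width=1)
```

Each row of `mask` contains a fixed number of allowed positions, so the cost per hit is constant and the total cost grows linearly with the number of hits once only the banded entries are evaluated.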

Abstract

Reconstructing charged particle tracks is a fundamental task in modern collider experiments. The unprecedented particle multiplicities expected at the High-Luminosity Large Hadron Collider (HL-LHC) pose significant challenges for track reconstruction, where traditional algorithms become computationally infeasible. To address this challenge, we present a novel learned approach to track reconstruction that adapts recent advances in computer vision and object detection. Our architecture combines a Transformer hit filtering network with a MaskFormer reconstruction model that jointly optimises hit assignments and the estimation of the charged particles' properties. Evaluated on the TrackML dataset, our best performing model achieves state-of-the-art tracking performance with 97% efficiency for a fake rate of 0.6%, and inference times of 100 ms. Our tunable approach enables specialisation for specific applications like triggering systems, while its underlying principles can be extended to other reconstruction challenges in high energy physics. This work demonstrates the potential of modern deep learning architectures to address emerging computational challenges in particle physics while maintaining the precision required for groundbreaking physics analysis.

Paper Structure

This paper contains 25 sections, 1 equation, 10 figures, 5 tables.

Figures (10)

  • Figure 1: Overview of the track reconstruction model, with data and operations being shown in green and blue respectively. $M$ input tokens representing the hits are fed into an initial Transformer encoder. The object decoder then takes a set of $N$ object queries, which represent tracks, and iteratively updates them with information from the input elements and other object queries. Finally, three task heads are used to: categorise each object as being from one of $C+1$ object classes (including a null class), estimate $R$ regression targets, and predict $N \times M$ binary masks which provide the assignment of input hits to output tracks. The resulting embedded queries and hits are then fed into another decoder layer to refine the predictions and produce an intermediate auxiliary loss for each layer. These decoder layers can be stacked repeatedly to increase accuracy at the expense of computational cost. On the right, a detailed view of an object decoder layer is shown.
  • Figure 2: Event display in the transverse $x$-$y$ plane, showing a view down the beam-line. This projection captures the cylindrical detector cross-section perpendicular to the beam axis ($z$), with the origin corresponding to the nominal interaction point. Hits and tracks are shown in the transverse plane, where particle trajectories typically curve due to the magnetic field oriented along the $z$-axis. (Left) The positions of pixel hits for a single event. (Middle) Hits passing the 750 MeV filter at a cut of 0.1. (Right) Filtered hits along with the trajectories of reconstructable particles (assuming a homogeneous 2 T magnetic field) that satisfy $p_{\mathrm{T}} > 1\,\mathrm{GeV}$ and $|\eta|<2.5$.
  • Figure 3: Event display in the $r$-$z$ projection of a cylindrical detector. The horizontal axis corresponds to the beam-line direction ($z$), and the vertical axis is the radial distance from the beam-line ($r$). The azimuthal coordinate ($\phi$) is suppressed in this view, such that tracks and hits are projected onto the $r$-$z$ plane irrespective of their azimuthal angle. (Left) The positions of pixel hits for a single event. (Middle) Hits passing the 750 MeV filter at a cut of 0.1. (Right) Filtered hits along with the trajectories of reconstructable particles (assuming a homogeneous 2 T magnetic field) that satisfy $p_{\mathrm{T}} > 1\,\mathrm{GeV}$ and $|\eta|<2.5$.
  • Figure 4: Histograms summarising the kinematics and object multiplicities of the TrackML dataset restricted to the inner pixel layers. (Top left) The $p_{\mathrm{T}}$ distribution of all particles with the last bin including overflow. (Top right) A histogram showing the distribution of the pseudorapidity $\eta$ of the particles. (Bottom left) The distribution of the number of pixel layer hits in the events before filtering. (Bottom right) The distribution of the number of total particles in each event. The distribution for all particles is shown in black. Also shown are the number of particles left after applying the three $p_{\mathrm{T}}$ cuts used in this work.
  • Figure 5: Hit filtering performance for each of the three models. (Left) Signal hit purity as a function of the signal hit efficiency. The markers show the efficiency and purity at the chosen threshold of 0.1 and each model achieves an area under the curve of 0.998. (Right) The fraction of particles that remain reconstructable as a function of simulated particle $p_{\mathrm{T}}$ after filtering hits that fall below the 0.1 threshold. Binomial errors are indicated by the shaded regions, and the final bin includes overflow.
  • ...and 5 more figures
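
The three task heads described in the Figure 1 caption can be sketched in terms of array shapes. The example below is a minimal illustration, not the authors' implementation: it assumes single linear layers for each head and a MaskFormer-style dot product between projected queries and hit embeddings to form the mask logits. All sizes, weight names, and the function `task_heads` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, D, C, R = 4, 10, 8, 1, 3   # queries, hits, embedding dim, classes, targets (illustrative)

def task_heads(queries, hits, w_cls, w_reg, w_mask):
    """Apply three assumed linear heads to the decoder outputs."""
    class_logits = queries @ w_cls              # (N, C+1), last class is the null class
    regression = queries @ w_reg                # (N, R) regression targets per track
    mask_logits = (queries @ w_mask) @ hits.T   # (N, M) hit-to-track assignment scores
    masks = mask_logits > 0.0                   # equivalent to sigmoid(logit) > 0.5
    return class_logits, regression, masks

# Random stand-ins for the embedded object queries and hits.
queries = rng.normal(size=(N, D))
hits = rng.normal(size=(M, D))
w_cls = rng.normal(size=(D, C + 1))
w_reg = rng.normal(size=(D, R))
w_mask = rng.normal(size=(D, D))

class_logits, regression, masks = task_heads(queries, hits, w_cls, w_reg, w_mask)
```

Row `i` of `masks` marks which of the $M$ hits are assigned to candidate track `i`; queries whose predicted class is the null class would be discarded.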