
Deploying a Hybrid PVFinder Algorithm for Primary Vertex Reconstruction in LHCb's GPU-Resident HLT1

Simon Akar, Mohamed Elashri, Conor Henderson, Michael Sokoloff

TL;DR

This work presents the development of an inference engine that integrates PVFinder, a hybrid deep neural network for finding primary vertices (the proton-proton collision points from which all subsequent particle decays originate), into Allen, LHCb's High Level Trigger (HLT1) framework.

Abstract

LHCb's Run 3 upgrade introduced a fully software-based trigger system operating at 30 MHz, processing an average of 5.6 proton-proton collision vertices per bunch crossing (event). This work presents the development of an inference engine that integrates PVFinder, a hybrid deep neural network for finding primary vertices (the proton-proton collision points from which all subsequent particle decays originate), into Allen, LHCb's High Level Trigger (HLT1) framework. The integration addresses critical real-time constraints, including fixed memory pools, single-stream execution, and sub-400 μs per-event processing budgets on NVIDIA GPUs. We introduce a translation layer that bridges Allen's Structure-of-Arrays (SoA) data layout with cuDNN's tensor format while maintaining zero-copy semantics and deterministic behavior. Current measurements show that the CNN stage contributes significant throughput overhead. We present a roadmap targeting order-of-magnitude improvements through mixed-precision computing, model compression, and other techniques.

Paper Structure (6 sections, 2 figures, 2 tables)


Figures (2)

  • Figure 1: PVFinder physics performance, showing efficiency vs. false positive rate for different configurations. The magenta configuration (FP32, 64-channel UNet), selected for deployment, achieves >97% efficiency with 0.03 false positives per event, significantly outperforming the LHCb heuristic baseline [dziurda_parallel_2025]. FP16 configurations show minimal performance degradation.
  • Figure 2: PVFinder hybrid architecture, showing the three-stage pipeline: FC layers process track parameters (9 features/track) into an intermediate representation, a UNet CNN refines spatial patterns into probability histograms, and peak finding extracts vertex positions. The FC stage is implemented in native CUDA, while the CNN stage uses cuDNN; the two are bridged by the translation layer.
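
The final stage named in the Figure 2 caption, peak finding on a probability histogram, can be illustrated with a minimal Python sketch. This is not the PVFinder or Allen implementation: the function name `find_peaks`, the parameters `threshold` and `min_integral`, their default values, and the binning are all hypothetical, chosen only to show the general idea of turning contiguous above-threshold bins into candidate vertex positions.

```python
from typing import List, Sequence


def find_peaks(hist: Sequence[float], z_min: float, bin_width: float,
               threshold: float = 0.01, min_integral: float = 0.5) -> List[float]:
    """Return probability-weighted mean positions of above-threshold runs.

    All names and default values here are illustrative assumptions,
    not the settings used by PVFinder.
    """
    peaks: List[float] = []
    run_sum = 0.0   # integrated probability of the current run of bins
    run_zsum = 0.0  # probability-weighted sum of bin centers in the run
    # A trailing 0.0 sentinel closes a run that touches the last bin.
    for i, p in enumerate(list(hist) + [0.0]):
        if p > threshold:
            z_center = z_min + (i + 0.5) * bin_width
            run_sum += p
            run_zsum += p * z_center
        else:
            # The run has ended: keep it only if it carries enough probability.
            if run_sum >= min_integral:
                peaks.append(run_zsum / run_sum)
            run_sum = 0.0
            run_zsum = 0.0
    return peaks


# Toy histogram: two clear peaks and one weak bump that is rejected.
toy_hist = [0.0, 0.0, 0.2, 0.6, 0.2, 0.0, 0.0, 0.05, 0.0, 0.5, 0.5]
toy_peaks = find_peaks(toy_hist, z_min=0.0, bin_width=1.0)
```

In this toy input the isolated 0.05 bin passes the per-bin threshold but fails the integral cut, so only two candidate positions are returned. A GPU-resident version would need a parallel formulation of the same scan, which this sequential sketch does not attempt.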