The LHCb ultra-fast simulation option, Lamarr: design and validation

Lucio Anderlini, Matteo Barbetti, Simone Capelli, Gloria Corti, Adam Davis, Denis Derkach, Nikita Kazeev, Artem Maevskiy, Maurizio Martinelli, Sergei Mokonenko, Benedetto Gianluca Siddi, Zehua Xu

TL;DR

The paper addresses the CPU bottleneck of full Geant4-based detector simulation at LHCb by introducing Lamarr, an ultra-fast simulation framework that maps generator-level particles to high-level detector responses through ML-based parameterizations, organized in two pipelines selected by particle charge. Lamarr integrates with the Gauss simulation stack and achieves a speed-up of two orders of magnitude in the simulation phase while preserving agreement with detailed simulation for key observables, as shown in validation studies on $\Lambda_b^0$ decays. The approach uses GANs for track smearing, GBDTs for acceptance, and specialized models for PID and calorimetry, with ongoing work on calorimeter translation using Graph Neural Networks and Transformers. Production readiness is pursued through C-based model deployment via scikinC, distribution on cvmfs, and stand-alone options, with the aim of relieving Run 3 resource pressure and potentially extending the approach to other experiments.

Abstract

Detailed detector simulation is the major consumer of CPU resources at LHCb, having used more than 90% of the total computing budget during Run 2 of the Large Hadron Collider at CERN. As data is collected by the upgraded LHCb detector during Run 3 of the LHC, larger requests for simulated data samples are necessary, and will far exceed the pledged resources of the experiment, even with existing fast simulation options. An evolution of technologies and techniques to produce simulated samples is mandatory to meet the upcoming needs of analysis to interpret signal versus background and measure efficiencies. In this context, we propose Lamarr, a Gaudi-based framework designed to offer the fastest solution for the simulation of the LHCb detector. Lamarr consists of a pipeline of modules parameterizing both the detector response and the reconstruction algorithms of the LHCb experiment. Most of the parameterizations are made of Deep Generative Models and Gradient Boosted Decision Trees trained on simulated samples or alternatively, where possible, on real data. Embedding Lamarr in the general LHCb Gauss Simulation framework allows combining its execution with any of the available generators in a seamless way. Lamarr has been validated by comparing key reconstructed quantities with Detailed Simulation. Good agreement of the simulated distributions is obtained with two-order-of-magnitude speed-up of the simulation phase.
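
The pipeline idea in the abstract (particles routed by charge, then passed through acceptance and resolution parameterizations) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the `Particle` class, the function names, the logistic turn-on standing in for a trained GBDT, and the Gaussian smearing standing in for a trained GAN are hypothetical and are not Lamarr's actual models or interfaces.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Particle:
    charge: int   # electric charge in units of e
    p: float      # generator-level momentum in GeV/c


def acceptance_probability(particle):
    # Hypothetical stand-in for a trained GBDT: a smooth turn-on in momentum.
    return 1.0 / (1.0 + math.exp(-(particle.p - 2.0)))


def smear_momentum(particle, rng):
    # Hypothetical stand-in for a trained GAN: Gaussian smearing with an
    # illustrative 0.5% relative resolution (not a measured LHCb value).
    return particle.p * (1.0 + rng.gauss(0.0, 0.005))


def simulate(particles, seed=0):
    """Route generator-level particles through a charged or a neutral branch."""
    rng = random.Random(seed)
    tracks, neutrals = [], []
    for part in particles:
        if part.charge != 0:
            # Charged branch: acceptance decision first, then smearing.
            if rng.random() < acceptance_probability(part):
                tracks.append({"p_true": part.p,
                               "p_reco": smear_momentum(part, rng)})
        else:
            # Neutral branch: pass-through placeholder for calorimeter models.
            neutrals.append({"p_true": part.p})
    return tracks, neutrals


if __name__ == "__main__":
    sample = [Particle(1, 50.0), Particle(-1, 30.0), Particle(0, 10.0)]
    tracks, neutrals = simulate(sample, seed=1)
    print(len(tracks), "tracks,", len(neutrals), "neutral candidates")
```

The point of the sketch is the structure: no energy deposits or hits are produced, and each step is a cheap function evaluation on generator-level quantities, which is where the speed-up over detailed simulation would come from.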

Paper Structure (9 sections, 4 figures)

Figures (4)

  • Figure 1: Schematic representation of the data processing flow in the detailed (top), fast (center) and ultra-fast (bottom) simulation paradigms.
  • Figure 2: Scheme of the $\hbox{\sc Lamarr}$ modular pipeline. According to the charge of the particle provided by the physics generator, two sets of parameterizations are defined: the charged particles are passed through the Tracking and PID models, while the neutral ones follow a different path where the calorimeter modeling plays a key role.
  • Figure 3: Distribution of the $(x, y)$-position of the reconstructed clusters on the LHCb ECAL face for a $2000 \times 1500~\rm{mm}^2$ frame placed around the center. The geometrical information is combined with the energy signature by weighting each bin entry with the cluster energy. The distribution obtained from detailed simulation is shown on the left, while the prediction of an adversarially trained Transformer model is shown on the right. The corresponding LHCB-FIGURE is in preparation.
  • Figure 4: Validation plots for $\Lambda_b^0 \to \Lambda_c^+ \mu^- \bar{\nu}_\mu$ decays with $\Lambda_c^+ \to p K^- \pi^+$ simulated with $\hbox{\sc Pythia8}$, $\hbox{\sc EvtGen}$ and $\hbox{\sc Lamarr}$ (orange markers) and compared with detailed simulation samples relying on $\hbox{\sc Pythia8}$, $\hbox{\sc EvtGen}$ and $\hbox{\sc Geant4}$ (cyan shaded histogram). Reproduced from https://cds.cern.ch/record/2814081.
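
The energy-weighted binning described in the Figure 3 caption can be sketched as follows. This is a minimal illustration, assuming clusters are given as `(x, y, energy)` tuples in millimetres relative to the center of the frame; the function name, the bin counts, and the half-open bin convention are hypothetical choices, not taken from the paper.

```python
def energy_weighted_histogram(clusters, nx, ny, width=2000.0, height=1500.0):
    """Fill an ny-by-nx grid where each cluster adds its energy to its bin.

    The frame is centered at (0, 0); clusters outside it are skipped.
    """
    grid = [[0.0 for _ in range(nx)] for _ in range(ny)]
    for x, y, energy in clusters:
        # Half-open ranges: the upper edges are excluded from the frame.
        if not (-width / 2 <= x < width / 2):
            continue
        if not (-height / 2 <= y < height / 2):
            continue
        ix = int((x + width / 2) / width * nx)
        iy = int((y + height / 2) / height * ny)
        grid[iy][ix] += energy  # weight the entry by the cluster energy
    return grid


if __name__ == "__main__":
    clusters = [(0.0, 0.0, 5.0), (10.0, 10.0, 2.5),
                (-999.0, -749.0, 1.0), (5000.0, 0.0, 9.0)]
    for row in energy_weighted_histogram(clusters, nx=4, ny=3):
        print(row)
```

Comparing two such grids, one filled from detailed simulation and one from a model's predictions, is the kind of side-by-side check the figure presents.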