Lamarr: LHCb ultra-fast simulation based on machine learning models deployed within Gauss
Matteo Barbetti
TL;DR
The paper addresses the looming CPU burden of detailed LHCb simulation for Run 3 by introducing Lamarr, a Gaudi-based ultra-fast simulation framework that parameterizes detector response and reconstruction using ML models. It combines Gradient Boosted Decision Trees for efficiency metrics and Generative Adversarial Networks for high-level distributions, with training on detailed simulations and real data where possible, and deploys via scikinC within Gauss. Validation on Λb0 decays demonstrates good agreement with full simulations for tracking and PID, indicating reliable reproduction of detector effects. The work suggests substantial speedups (up to 1000x in preliminary studies) and outlines ongoing improvements for neutral particles and broader data-driven calibration, highlighting significant practical impact for LHCb data analyses.
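The parameterization idea summarized above can be sketched in a few lines. This is only an illustrative toy, not Lamarr's implementation: `toy_efficiency` and `toy_smear` are hypothetical stand-ins (a hand-written curve and a 1% Gaussian resolution) for what would be a trained efficiency model and a trained generative model, and the momentum range is an arbitrary assumption.

```python
import math
import random


def toy_efficiency(p_gev: float) -> float:
    """Hypothetical stand-in for a trained efficiency model (assumed curve)."""
    return 0.95 * (1.0 - math.exp(-p_gev / 5.0))


def toy_smear(p_gev: float, rng: random.Random) -> float:
    """Hypothetical stand-in for a generative resolution model (assumed 1% Gaussian)."""
    return rng.gauss(p_gev, 0.01 * p_gev)


def fast_simulate(true_momenta, rng):
    """Map generator-level momenta to reconstructed-level momenta.

    Each particle is kept with probability toy_efficiency(p), then its
    momentum is smeared. No particle-matter interaction is simulated.
    """
    reco = []
    for p in true_momenta:
        if rng.random() < toy_efficiency(p):
            reco.append(toy_smear(p, rng))
    return reco


rng = random.Random(42)
truth = [rng.uniform(2.0, 100.0) for _ in range(10_000)]  # assumed range, GeV
reco = fast_simulate(truth, rng)
print(f"accepted {len(reco)} of {len(truth)} generated particles")
```

The point of the sketch is the structure: an accept/reject step driven by an efficiency model, followed by sampling of high-level reconstructed quantities, with no detailed detector simulation in between.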
Abstract
About 90% of the computing resources available to the LHCb experiment have been spent producing simulated data samples for Run 2 of the Large Hadron Collider at CERN. The upgraded LHCb detector will be able to collect larger data samples, so many more simulated events will be required to analyze the data to be collected in Run 3. Simulation is essential to analysis: it is needed to interpret signal, reject background, and measure efficiencies. The simulation needed will far exceed the pledged resources, requiring an evolution in the technologies and techniques used to produce these simulated data samples. In this contribution, we discuss Lamarr, a Gaudi-based framework that speeds up simulation production by parameterizing both the detector response and the reconstruction algorithms of the LHCb experiment. Deep Generative Models powered by several algorithms and strategies are employed to effectively parameterize the high-level response of the individual components of the LHCb detector, encoding within neural networks the experimental errors and uncertainties introduced in the detection and reconstruction phases. Where possible, models are trained directly on real data, statistically subtracting any background components by applying appropriate reweighting procedures. Embedding Lamarr in the general LHCb Gauss simulation framework allows its execution to be combined seamlessly with any of the available generators. The resulting software package enables a simulation process independent of the detailed simulation used to date.
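As an illustration of statistically subtracting background through per-event weights, the sketch below uses simple sideband subtraction. The abstract does not say which reweighting procedure is used, so this technique is an assumption chosen for brevity. The mass windows, the flat background shape, and the toy sample are all hypothetical.

```python
import random

# Hypothetical mass windows in GeV (illustrative values only).
SIGNAL = (5.55, 5.69)
SIDEBANDS = [(5.40, 5.47), (5.77, 5.84)]


def sideband_weight(mass: float) -> float:
    """Per-event weight: +1 in the signal window, negative in the sidebands.

    The sideband weight is scaled so that, for a flat background, the
    weighted background contribution cancels on average.
    """
    lo, hi = SIGNAL
    if lo <= mass < hi:
        return 1.0
    sideband_width = sum(b - a for a, b in SIDEBANDS)
    for a, b in SIDEBANDS:
        if a <= mass < b:
            return -(hi - lo) / sideband_width
    return 0.0


# Toy sample: a narrow Gaussian signal peak on top of a flat background.
rng = random.Random(7)
n_signal = 2_000
masses = [rng.gauss(5.62, 0.015) for _ in range(n_signal)]
masses += [rng.uniform(5.40, 5.84) for _ in range(10_000)]

weights = [sideband_weight(m) for m in masses]
yield_estimate = sum(weights)
print(f"weighted yield: {yield_estimate:.0f} (true signal: {n_signal})")
```

Weights of this kind can be passed as sample weights when filling histograms or training a model, so that the background contribution cancels statistically rather than being removed event by event.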
