A First Full Physics Benchmark for Highly Granular Calorimeter Surrogates
Thorsten Buss, Henry Day-Hall, Frank Gaede, Gregor Kasieczka, Katja Krüger, Anatolii Korol, Thomas Madlener, Peter McKeown
TL;DR
The paper presents the first full-physics benchmark for highly granular calorimeter surrogates integrated into realistic detector simulations via DDML, a library that couples generative surrogates to detectors implemented with the DD4hep toolkit. It compares two surrogates, ConvL2LFlows (grid-based) and CaloClouds (point-cloud), against Geant4 across multiple shower representations and three post-reconstruction benchmarks: single photons, di-photon separations, and hadronic tau decays. Results show that CaloClouds delivers substantial speedups with fidelity close to that of the idealized reference simulators, while grid-based methods suffer from representation-induced artifacts and limited containment, especially at higher energies. The work demonstrates that carefully chosen shower representations and tight integration into production software are critical to translating surrogate models into practical, reconstruction-level physics analyses, with region-specific calibration offering a path to further improvements.
Abstract
The physics programs of current and future collider experiments necessitate the development of surrogate simulators for calorimeter showers. While much progress has been made in the development of generative models for this task, they have typically been evaluated in simplified scenarios and for single particles. This is particularly true for the challenging task of highly granular calorimeter simulation. For the first time, this work studies the use of highly granular generative calorimeter surrogates in a realistic simulation application. We introduce DDML, a generic library which enables the combination of generative calorimeter surrogates with realistic detectors implemented using the DD4hep toolkit. We compare two different generative models: one operating on a regular grid representation, and the other using a less common point cloud approach. In order to disentangle methodological details from model performance, we provide comparisons to idealized simulators which directly sample representations of different resolutions from the full-simulation ground truth. We then systematically evaluate model performance on post-reconstruction benchmarks for electromagnetic shower simulation. Beginning with a typical single-particle study, we introduce a first multi-particle benchmark based on di-photon separations, before studying a first full-physics benchmark based on hadronic decays of the tau lepton. Our results indicate that models operating on a point cloud can achieve a favorable balance between speed and accuracy for highly granular calorimeter simulation compared to those which operate on a regular grid representation.
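To make the distinction between a point cloud and a regular grid representation concrete, the following is a minimal sketch rather than the procedure used in the paper. It re-bins a toy set of hits into cubic cells of a chosen size and checks that the total energy is preserved while positions are quantized to cell centres. The function name `voxelize`, the Gaussian toy shower, the millimetre units, and the cell sizes are all assumptions made for illustration.

```python
import numpy as np


def voxelize(points, energies, cell_size):
    """Sum hit energies into cubic cells of edge `cell_size`.

    Hypothetical helper for illustration only. Returns the centres of the
    occupied cells and the summed energy in each of them.
    """
    # Integer cell index of every hit along x, y, z.
    idx = np.floor(points / cell_size).astype(np.int64)
    # Unique occupied cells, plus the cell each hit belongs to.
    cells, inverse = np.unique(idx, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # flatten; shape differs across NumPy versions
    cell_energy = np.bincount(inverse, weights=energies, minlength=len(cells))
    centres = (cells + 0.5) * cell_size
    return centres, cell_energy


def centre_of_gravity(points, energies):
    """Energy-weighted mean position."""
    return np.average(points, axis=0, weights=energies)


# Toy "ground truth": 500 hits with made-up positions (mm) and energies.
rng = np.random.default_rng(0)
points = rng.normal(scale=10.0, size=(500, 3))
energies = rng.exponential(scale=1.0, size=500)

cog_truth = centre_of_gravity(points, energies)
shifts = {}
for cell_size in (1.0, 5.0, 20.0):  # assumed resolutions, finest to coarsest
    centres, cell_energy = voxelize(points, energies, cell_size)
    shift = np.linalg.norm(centre_of_gravity(centres, cell_energy) - cog_truth)
    shifts[cell_size] = shift
    print(f"cell={cell_size:5.1f} mm  occupied cells={len(centres):4d}  "
          f"centre-of-gravity shift={shift:.3f} mm")
```

Each hit moves by at most half the cell diagonal when snapped to its cell centre, so the centre-of-gravity shift is bounded by `cell_size * sqrt(3) / 2`. Coarser grids therefore admit larger position errors, which is one simple way a representation alone can distort a shower before any generative model is involved.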
