A First Full Physics Benchmark for Highly Granular Calorimeter Surrogates

Thorsten Buss, Henry Day-Hall, Frank Gaede, Gregor Kasieczka, Katja Krüger, Anatolii Korol, Thomas Madlener, Peter McKeown

TL;DR

The paper presents the first full-physics benchmark for highly granular calorimeter surrogates integrated into realistic detector simulations via DDML, a library for detectors implemented with the DD4hep toolkit. It contrasts two advanced surrogates, ConvL2LFlows (grid-based) and CaloClouds (point-cloud), across multiple shower representations and three benchmarks (single photons, di-photon separations, and hadronic tau decays), each compared against Geant4. Results show that CaloClouds delivers substantial speedups with fidelity closely approaching the idealized references, while grid-based methods suffer from representation-induced artifacts and limited containment, especially at higher energies. The work demonstrates that carefully chosen shower representations and tight integration into production software are critical to translating surrogate models into practical, reconstruction-level physics analyses, with region-specific calibration offering a path to further improvements.

Abstract

The physics programs of current and future collider experiments necessitate the development of surrogate simulators for calorimeter showers. While much progress has been made in the development of generative models for this task, they have typically been evaluated in simplified scenarios and for single particles. This is particularly true for the challenging task of highly granular calorimeter simulation. For the first time, this work studies the use of highly granular generative calorimeter surrogates in a realistic simulation application. We introduce DDML, a generic library which enables the combination of generative calorimeter surrogates with realistic detectors implemented using the DD4hep toolkit. We compare two different generative models: one operating on a regular grid representation, and the other using a less common point cloud approach. In order to disentangle methodological details from model performance, we provide comparisons to idealized simulators which directly sample representations of different resolutions from the full simulation ground-truth. We then systematically evaluate model performance on post-reconstruction benchmarks for electromagnetic shower simulation. Beginning with a typical single-particle study, we introduce a first multi-particle benchmark based on di-photon separations, before studying a first full-physics benchmark based on hadronic decays of the tau lepton. Our results indicate that models operating on a point cloud can achieve a favorable balance between speed and accuracy for highly granular calorimeter simulation compared to those which operate on a regular grid representation.

Paper Structure

This paper contains 29 sections, 1 equation, 15 figures, 1 table, 1 algorithm.

Figures (15)

  • Figure 1: Visualization of geometry maps for (left) the physical geometry and (right) the regularized geometry for a section of two sensitive layers in the calorimeter. The physical geometry includes gaps between the cells arising from insensitive volumes such as structural supports and readout electronics, as well as a staggering effect between layers. The regularized geometry consists purely of sensitive material, with the cells being perfectly aligned from one layer to the next. Figure from McKeown:2024.
  • Figure 2: Visualization of the same 90 GeV electromagnetic shower in lateral projection of the ILD ECAL, represented using the three optimal shower generators: Optimum (x1) (left), Optimum (x9) (center), and Optimum (steps) (right).
  • Figure 3: Radial (left) and longitudinal (right) energy profiles of electromagnetic showers, computed at the reconstruction level after integration of the generative models. The radial profile shows the mean reconstructed energy as a function of distance from the shower axis, while the longitudinal profile shows the mean reconstructed energy per calorimeter layer. These observables provide a detailed characterization of the transverse and longitudinal shower structure and are critical benchmarks for assessing how well generative models replicate Geant4 showers in realistic detector geometry settings. The color coding corresponds to the different generative models: CaloClouds3 (orange), ConvL2LFlows (violet), Optimum (x1) (blue), Optimum (x9) (green), and Optimum (steps) (cyan). The Geant4 reference is shown in the light grey filled histogram. The color coding is consistent across all figures in this section. Shaded bands indicate statistical uncertainties; lower panels show relative deviations with respect to the Geant4 baseline.
  • Figure 4: Radial energy profile of the showers, zoomed in to the first 30 mm from the shower axis. The shaded error bands correspond to the statistical uncertainty in each bin. The lower subplot shows the relative deviation of the radial energy profile with respect to the Geant4 reference. The color coding is consistent with Figure 3.
  • Figure 5: Energy resolution (left) and linearity (right) of reconstructed photon showers in the ILD ECAL. The resolution is defined as the relative width $\sigma_{90}/\mu_{90}$ of the central 90% interval of the reconstructed energy distribution. The linearity is given by the mean $\mu_{90}$ of this central interval as a function of the incident photon energy.
  • ...and 10 more figures
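
The Figure 5 caption defines the resolution as $\sigma_{90}/\mu_{90}$ and the linearity as $\mu_{90}$, both taken over the central 90% interval of the reconstructed energy distribution. The following is a minimal sketch of how such quantities could be computed, not the authors' implementation. It assumes that "central 90%" means the entries between the 5th and 95th percentiles (another common convention uses the narrowest window containing 90% of the entries), and the function names `mu90_sigma90` and `resolution_and_linearity` are hypothetical.

```python
import numpy as np


def mu90_sigma90(energies):
    """Mean and standard deviation of the central 90% of a sample.

    Assumption: "central 90%" is taken as the entries lying between
    the 5th and 95th percentiles of the distribution.
    """
    e = np.asarray(energies, dtype=float)
    lo, hi = np.quantile(e, [0.05, 0.95])
    core = e[(e >= lo) & (e <= hi)]  # drop the 5% tails on each side
    return core.mean(), core.std()


def resolution_and_linearity(energies):
    """Return (sigma90 / mu90, mu90) for one incident-energy point."""
    mu90, sigma90 = mu90_sigma90(energies)
    return sigma90 / mu90, mu90


if __name__ == "__main__":
    # Toy reconstructed energies for a single incident energy (illustrative only).
    rng = np.random.default_rng(0)
    toy = rng.normal(loc=50.0, scale=5.0, size=10_000)
    res, lin = resolution_and_linearity(toy)
    print(f"resolution = {res:.4f}, linearity (mu90) = {lin:.2f}")
```

Restricting the calculation to the central interval makes both quantities less sensitive to tails in the reconstructed energy distribution than a plain mean and standard deviation would be. Evaluating `resolution_and_linearity` once per incident photon energy would give the points of a resolution curve and a linearity curve.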