CaloClouds II: Ultra-Fast Geometry-Independent Highly-Granular Calorimeter Simulation

Erik Buhmann, Frank Gaede, Gregor Kasieczka, Anatolii Korol, William Korcari, Katja Krüger, Peter McKeown

TL;DR

We address the need for ultra-fast, high-fidelity simulation of energy depositions in highly granular calorimeters by introducing CaloClouds II, a geometry-independent point-cloud diffusion model that combines continuous-time EDM diffusion with consistency distillation for single-step generation. The method removes the latent space, expands a Shower Flow to predict shower-wide properties, and distills the diffusion model into a single-evaluation consistency model, achieving a speed-up over Geant4 of up to $46\times$ on CPU (and up to $1873\times$ on GPU). Across physics observables and high-level evaluations, the CaloClouds II variants closely match Geant4, with the consistency model (CM) variant offering the best fidelity-speed trade-off and marking the first application of consistency distillation to calorimeter showers. These advances enable practical deployment of fast, geometry-agnostic calorimeter simulations in future collider workflows and provide a foundation for further fidelity improvements and geometry generalization.

Abstract

Fast simulation of the energy depositions in highly granular detectors is needed for future collider experiments with ever-increasing luminosities. Generative machine learning (ML) models have been shown to speed up and augment the traditional simulation chain in physics analysis. However, the majority of previous efforts were limited to models relying on fixed, regular detector readout geometries. A major advancement is the recently introduced CaloClouds model, a geometry-independent diffusion model that generates calorimeter showers as point clouds for the electromagnetic calorimeter of the envisioned International Large Detector (ILD). In this work, we introduce CaloClouds II, which features a number of key improvements. These include continuous-time score-based modelling, which allows sampling in 25 steps with fidelity comparable to CaloClouds while yielding a $6\times$ speed-up over Geant4 on a single CPU ($5\times$ over CaloClouds). We further distill the diffusion model into a consistency model, allowing for accurate sampling in a single step and resulting in a $46\times$ speed-up over Geant4 ($37\times$ over CaloClouds). This constitutes the first application of consistency distillation for the generation of calorimeter showers.
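As a rough illustration of the difference between multi-step diffusion sampling and single-step consistency sampling, the following toy sketch (not the authors' code) uses one-dimensional Gaussian "data", for which both the optimal denoiser and the consistency function are known in closed form. The noise range, the value of `SIGMA_DATA`, the EDM-style time spacing, the plain Euler solver, and all function names are assumptions made for this example; the paper's actual sampler and network are not reproduced here.

```python
import math

SIGMA_DATA = 0.5            # assumed spread of the toy 1D "data"
T_MAX, T_MIN = 80.0, 0.002  # assumed noise range (hypothetical values)


def karras_times(n_steps, rho=7.0):
    """Return n_steps + 1 noise levels from T_MAX down to T_MIN (EDM-style spacing)."""
    hi, lo = T_MAX ** (1 / rho), T_MIN ** (1 / rho)
    return [(hi + i / n_steps * (lo - hi)) ** rho for i in range(n_steps + 1)]


def denoise(x, t):
    """Optimal denoiser for data ~ N(0, SIGMA_DATA^2); stands in for a trained network."""
    return x * SIGMA_DATA**2 / (SIGMA_DATA**2 + t**2)


def sample_multistep(x_T, n_steps=25):
    """Euler integration of the probability-flow ODE dx/dt = (x - D(x, t)) / t."""
    ts = karras_times(n_steps)
    x = x_T
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        # One denoiser evaluation per step, so n_steps evaluations in total.
        x = x + (t_next - t_cur) * (x - denoise(x, t_cur)) / t_cur
    return x


def sample_single_step(x_T):
    """Closed-form consistency function of the toy ODE: maps x_T to the endpoint at T_MIN."""
    return x_T * math.sqrt((SIGMA_DATA**2 + T_MIN**2) / (SIGMA_DATA**2 + T_MAX**2))


x_T = 80.0  # one fixed value standing in for a draw from N(0, T_MAX^2)
print(sample_multistep(x_T, 25), sample_single_step(x_T))
```

In this toy setting the 25-step Euler result differs from the closed-form endpoint by a discretisation error that shrinks as steps are added, whereas the consistency function reaches the endpoint in a single evaluation. That single-evaluation mapping is the behaviour a distilled consistency model is trained to approximate.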

Paper Structure

This paper contains 13 sections, 6 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Illustration of the training and sampling procedure of the CaloClouds II model. (a) During training, the model is trained at a randomly sampled continuous time step $t$, conditioned on the shower energy $E$ and the number of points $N$. The loss, $L_\text{MSE}$, is approximated by a simple mean squared error (MSE) between the noised data and the denoised output. The scaling functions $c_\text{in}$, $c_\text{out}$, and $c_\text{skip}$ are defined following Eq. \ref{eq:scaling}. (b) During sampling, the $E$-conditional Shower Flow generates $N$ as well as shower observables for calibration. After $N$ has been calibrated, the PointWise Net iteratively denoises noise drawn from $\mathcal{N}(\boldsymbol{0}, T^2 \boldsymbol{I})$ into a calorimeter shower. When sampling with CaloClouds II (CM), only one denoising step is performed.
  • Figure 2: Illustration of consistency distillation, in which the diffusion model of CaloClouds II (teacher model) is distilled into a consistency model (student and target models). The student model is updated via gradient descent, and the target model is updated as an exponential moving average of the student model weights.
  • Figure 3: Histogram of the cell energies (left), radial shower profile (center), and longitudinal shower profile (right) for Geant4, CaloClouds, CaloClouds II, and CaloClouds II (CM). In the cell energy distribution, the region below 0.1 MeV is grayed out (see main text for details). All distributions are calculated with 40,000 events sampled with a uniform distribution of incident particle energies between 10 and 90 GeV. The bottom panel provides the ratio to Geant4. The error band corresponds to the statistical uncertainty in each bin.
  • Figure 4: Position of the center of gravity of showers along the $X$ (left), $Y$ (center), and $Z$ (right) directions. All distributions are calculated for 40,000 showers with a uniform distribution of incident particle energies between 10 and 90 GeV. The error band corresponds to the statistical uncertainty in each bin.
  • Figure 5: Visible energy sum (left) and the number of hits (right) distributions, for 10, 50, and 90 GeV showers. For each energy and model, 2,000 showers are shown. The error band corresponds to the statistical uncertainty in each bin.
  • ...and 2 more figures
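The captions of Figures 1 and 2 mention two small building blocks that can be sketched in a few lines: the scaling functions $c_\text{skip}$, $c_\text{out}$, $c_\text{in}$ that wrap the network output, and the exponential-moving-average (EMA) update of the target model. The sketch below assumes the standard EDM preconditioning of Karras et al.; the scaling equation of the paper itself is not reproduced in this summary and may differ in detail. The function names, the list-of-floats representation of points and weights, and the values of `sigma_data` and `decay` are hypothetical choices for illustration.

```python
import math


def edm_scalings(t, sigma_data=0.5):
    """Return (c_skip, c_out, c_in) in the standard EDM form (assumed here)."""
    total = math.sqrt(sigma_data**2 + t**2)
    c_skip = sigma_data**2 / (sigma_data**2 + t**2)
    c_out = sigma_data * t / total
    c_in = 1.0 / total
    return c_skip, c_out, c_in


def preconditioned_denoiser(network, x, t, sigma_data=0.5):
    """D(x, t) = c_skip * x + c_out * F(c_in * x, t), where F is the raw network."""
    c_skip, c_out, c_in = edm_scalings(t, sigma_data)
    raw = network([c_in * xi for xi in x], t)
    return [c_skip * xi + c_out * fi for xi, fi in zip(x, raw)]


def ema_update(target, student, decay=0.95):
    """Move each target weight towards the matching student weight (EMA)."""
    return [decay * w_t + (1.0 - decay) * w_s for w_t, w_s in zip(target, student)]


# Hypothetical stand-in for a trained network: it always predicts zeros.
zero_network = lambda x, t: [0.0 for _ in x]

print(edm_scalings(0.0))
print(preconditioned_denoiser(zero_network, [1.0, -2.0], 0.0))
print(ema_update([1.0, 1.0], [0.0, 2.0], decay=0.5))
```

At $t = 0$ the skip connection passes the input through unchanged, which is why the wrapped denoiser returns its input there regardless of the network output. The EMA update only ever blends weights, so the target model changes more slowly than the student.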