One flow to correct them all: improving simulations in high-energy physics with a single normalising flow and a switch
Caio Cesar Daumann, Mauro Donega, Johannes Erdmann, Massimiliano Galli, Jan Lukas Späh, Davide Valsecchi
TL;DR
The paper tackles mismodelling in the Monte Carlo (MC) simulations used in high-energy physics by introducing a morphing method based on a single normalising flow conditioned on a boolean, IsData. The flow learns a shared base distribution for data and simulation, which enables quantile morphing: simulated samples are first mapped to the base distribution, then the condition is flipped and the inverse transform is applied to bring them into data space. Validated on two-dimensional benchmarks and on a physics-inspired toy dataset with non-trivial correlations, the approach reaches agreement at the 1–2% level in the marginal distributions and substantially improves the correlation structure, while rendering data and corrected simulation nearly indistinguishable to a boosted decision tree classifier. The method is simple to train, robust across ancillary variables, and extendable to multi-domain morphing, offering a broadly applicable tool for data-driven MC corrections in high-energy physics and related fields.
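The switch mechanism can be illustrated with a minimal sketch. It assumes a conditional affine map, fitted in closed form, as a stand-in for a trained normalising flow; the class `ConditionalAffineFlow`, the toy Gaussian samples, and all numerical values are hypothetical and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy samples (illustrative values only): "data" and a mismodelled "simulation".
data = rng.multivariate_normal([1.0, -0.5], [[1.0, 0.6], [0.6, 2.0]], size=5000)
sim = rng.multivariate_normal([0.8, -0.2], [[1.3, 0.1], [0.1, 1.5]], size=5000)


class ConditionalAffineFlow:
    """Stand-in for a conditional flow: x = mu_c + L_c z, with z ~ N(0, I)."""

    def fit(self, x, is_data):
        # Maximum-likelihood fit of the affine map for each value of the
        # boolean condition (sample mean and Cholesky factor of the covariance).
        self.mu, self.chol = {}, {}
        for c in (False, True):
            xc = x[is_data == c]
            self.mu[c] = xc.mean(axis=0)
            self.chol[c] = np.linalg.cholesky(np.cov(xc, rowvar=False, bias=True))
        return self

    def forward(self, x, c):
        # Sample space -> shared base space, given condition c.
        return np.linalg.solve(self.chol[c], (x - self.mu[c]).T).T

    def inverse(self, z, c):
        # Shared base space -> sample space, given condition c.
        return self.mu[c] + z @ self.chol[c].T


# Train one model on the combined sample, labelled by the boolean condition.
x = np.concatenate([sim, data])
is_data = np.concatenate([np.zeros(len(sim), dtype=bool), np.ones(len(data), dtype=bool)])
flow = ConditionalAffineFlow().fit(x, is_data)

# Morphing: simulation -> base (IsData=False), flip the switch, base -> data space.
z = flow.forward(sim, False)
corrected = flow.inverse(z, True)

print("data mean:     ", data.mean(axis=0))
print("sim mean:      ", sim.mean(axis=0))
print("corrected mean:", corrected.mean(axis=0))
```

Because the stand-in is affine, it can only match means and covariances. A trained normalising flow would take the boolean as an input to a single network with shared parameters and could also correct non-Gaussian shapes and non-linear correlations.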
Abstract
Simulated events are key ingredients in almost all high-energy physics analyses. However, imperfections in the simulation can lead to sizeable differences between the observed data and simulated events. The effects of such mismodelling on relevant observables must be corrected, whether effectively via scale factors, with weights, or by modifying the distributions of the observables and their correlations. We introduce a correction method that transforms one multidimensional distribution (simulation) into another (data) using a simple architecture based on a single normalising flow with a boolean condition. We demonstrate the effectiveness of the method on a physics-inspired toy dataset with non-trivial mismodelling of several observables and their correlations.
