The Landscape of Unfolding with Machine Learning
Nathan Huetsch, Javier Mariño Villadamigo, Alexander Shmakov, Sascha Diefenbacher, Vinicius Mikuni, Theo Heimel, Michael Fenton, Kevin Greif, Benjamin Nachman, Daniel Whiteson, Anja Butter, Tilman Plehn
TL;DR
The paper addresses unfolding in high-energy physics: removing detector effects from measured data and mapping observations back to particle- or parton-level quantities. It surveys three ML-based families: reweighting (OmniFold and Bayesian variants), distribution mapping (Schrödinger Bridge and Direct Diffusion), and conditional generative unfolding (cINN, Transfermer, CFM, TraCFM, Latent Diffusion). By benchmarking all methods on the same datasets, the authors show that each approach reproduces particle- and parton-level distributions with percent-level accuracy across complex observables, while offering complementary strengths and uncertainty quantification. The study demonstrates that unbinned, high-dimensional cross-section measurements are practical, using $Z$+jets detector unfolding and top-quark pair production as test cases, which makes such measurements more accessible to the wider community and may provide sensitivity to new phenomena. The results point to a versatile ML toolkit for future SM tests and global analyses, combining model-agnostic reweighting, distribution mapping, and physics-informed generative modeling. These advances matter in practice because they reduce reliance on expensive forward simulations and enable precise, multi-dimensional unfolding in current collider data analyses.
Abstract
Recent innovations from machine learning allow for data unfolding without binning and including correlations across many dimensions. We describe a set of known, upgraded, and new methods for ML-based unfolding. The performance of these approaches is evaluated on the same two datasets. We find that all techniques are capable of accurately reproducing the particle-level spectra across complex observables. Given that these approaches are conceptually diverse, they offer an exciting toolkit for a new class of measurements that can probe the Standard Model with an unprecedented level of detail and may enable sensitivity to new phenomena.
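
To make the reweighting family named in the TL;DR more concrete, the following is a minimal toy sketch of its basic building block: a classifier trained to separate simulation from data, whose output is converted into per-event weights. It is not the OmniFold algorithm itself, which iterates such steps between detector level and particle level, and it is not taken from the paper. The one-dimensional Gaussian samples, the hand-written logistic regression, and all names (`fit_logistic`, `reweight`, `sim`, `data`) are assumptions chosen purely for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, steps=300):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by full-batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def reweight(sim, data, lr=0.5, steps=300):
    """Return one weight per simulated event so that the weighted
    simulation approximates the data distribution."""
    xs = sim + data
    ys = [0.0] * len(sim) + [1.0] * len(data)  # label 0 = simulation, 1 = data
    w, b = fit_logistic(xs, ys, lr=lr, steps=steps)
    # The classifier odds p / (1 - p) = exp(w * x + b) estimate the density
    # ratio data/simulation, up to the ratio of the two sample sizes.
    scale = len(sim) / len(data)
    return [scale * math.exp(w * x + b) for x in sim]


# Toy inputs (assumption): simulation and data are unit-width Gaussians
# with slightly different means.
rng = random.Random(0)
sim = [rng.gauss(0.0, 1.0) for _ in range(2000)]
data = [rng.gauss(0.5, 1.0) for _ in range(2000)]

weights = reweight(sim, data)
weighted_mean = sum(wt * x for wt, x in zip(weights, sim)) / sum(weights)
data_mean = sum(data) / len(data)
print(f"weighted simulation mean: {weighted_mean:.3f}, data mean: {data_mean:.3f}")
```

In this toy setup the weighted simulation mean should move close to the data mean. A realistic analysis would replace the logistic regression with a neural network acting on many observables at once, which is what allows the approach to remain unbinned and high-dimensional.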
