Profiling systematic uncertainties in Simulation-Based Inference with Factorizable Normalizing Flows

Davide Valsecchi, Mauro Donegà, Rainer Wallny

TL;DR

The paper tackles the challenge of profiling high-dimensional systematic uncertainties in unbinned likelihood analyses by introducing a Simulation-Based Inference framework based on Factorizable Normalizing Flows (FNF). It defines Distributions of Interest (DoI) as learnable invertible transformations of the feature space, enabling functional, distribution-wide measurements beyond scalar parameters. A two-tier approach combines a nominal density with a modular, linear/quadratic deformation for nuisances, together with amortized training that maps nuisance configurations to DoI deformations, and an orthogonal decomposition to interpret dominant uncertainty modes. The method is validated on a synthetic high-energy physics-like dataset, showing scalable profiling, accurate DoI recovery, and robust uncertainty quantification, with potential applications to unfolding and differential cross-section measurements in complex analyses.

Abstract

Unbinned likelihood fits aim at maximizing the information one can extract from experimental data, yet their application in realistic statistical analyses is often hindered by the computational cost of profiling systematic uncertainties. Additionally, current machine learning-based inference methods are typically limited to estimating scalar parameters in a multidimensional space rather than full differential distributions. We propose a general framework for Simulation-Based Inference (SBI) that efficiently profiles nuisance parameters while measuring multivariate Distributions of Interest (DoI), defined as learnable invertible transformations of the feature space. We introduce Factorizable Normalizing Flows to model systematic variations as parametric deformations of a nominal density, preserving tractability without combinatorial explosion. Crucially, we develop an amortized training strategy that learns the conditional dependence of the DoI on nuisance parameters in a single optimization process, bypassing the need for repetitive training during the likelihood scan. This allows for the simultaneous extraction of the underlying distribution and the robust profiling of nuisances. The method is validated on a synthetic dataset emulating a high-energy physics measurement with multiple systematic sources, demonstrating its potential for unbinned, functional measurements in complex analyses.

Paper Structure

This paper contains 38 sections, 33 equations, and 14 figures.

Figures (14)

  • Figure 1: Conceptual overview of the proposed framework. The method consists of learning a Distribution of Interest (DoI) $T_{\phi}^{\hat{\nu}}$ that maps the nominal reference model to the observed data, while simultaneously learning a Systematic Transformation $T_{\psi_f}$ that captures the effect of nuisance parameters as deformations of the feature space. The training is performed in an amortized fashion by sampling nuisance parameters from their prior distribution and evaluating the extended likelihood $\mathcal{L}_{\text{ext}}$ at each sampled point, allowing for a global optimization of the nuisance dependence.
  • Figure 2: Schematic representation of the Factorizable Normalizing Flow layer. The shift $s$ and scale $t$ parameters are computed as a sum of independent contributions from each systematic uncertainty $\nu_k$. Each contribution is parameterized as a quadratic function of $\nu_k$, with coefficients learned by a dedicated Masked MLP $\Psi^{(k)}$ conditioned on the inputs.
  • Figure 3: Residual transformation $T_{\nu_1}(y|x, f)$ learned by the Factorizable Normalizing Flow to capture the effect of the $\nu_1$ systematic variation on the input feature distribution $y$. The residuals are shown as a function of the kinematic variable $x$ for each flavour $f$. The transformation effectively captures the shape variations induced by the systematic uncertainty, allowing for accurate modeling of its impact on the likelihood fit.
  • Figure 4: Schematic of the proposed training procedure. First, a global minimum is found by jointly optimizing the nominal transformation $T_{\phi_f}$ and the nuisance parameters $\nu$. Then the amortized training for systematics profiling starts. A nuisance configuration $\nu_j$ is sampled from the likelihood space (top left). The "Systematic Map" network $T_{\psi_f}$ takes this $\nu_j$ as a conditioning input along with the data. Each gradient descent step updates the network parameters to maximize the likelihood at the sampled point $\nu_j$, so that the network learns to approximate the profile likelihood curve across the entire nuisance space.
  • Figure 5: Structure of the synthetic dataset. Top row: distribution of events in the kinematic space $x$ for classes A and B (left), and distribution of the feature space $y$ for the nominal model (center) and the distorted one (right). Bottom row: distribution of events in the feature space $y$ for different values of $x$.
  • ...and 9 more figures
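
The factorizable layer described in the Figure 2 caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the per-nuisance Masked MLPs $\Psi^{(k)}$ are replaced by fixed random linear maps, the scale is assumed to enter through an exponential (so the summed quantity is a log-scale), and all names (`N_NUISANCES`, `N_COND`, `WEIGHTS`, `coefficients`, `shift_and_log_scale`, `forward`, `inverse`) are hypothetical. The sketch only shows the structure stated in the caption: the shift and scale are sums of independent contributions, each quadratic in its own $\nu_k$.

```python
import numpy as np

rng = np.random.default_rng(0)

N_NUISANCES = 3  # number of systematic sources (hypothetical)
N_COND = 2       # dimension of the conditioning input x (hypothetical)

# Stand-ins for the per-nuisance Masked MLPs: each maps x to four
# coefficients (linear and quadratic terms for the shift and the log-scale).
# Fixed random linear maps are an assumption made purely for illustration.
WEIGHTS = rng.normal(scale=0.1, size=(N_NUISANCES, N_COND, 4))


def coefficients(x):
    """Per-nuisance coefficients, shape (n_events, N_NUISANCES, 4)."""
    return np.einsum("nc,kcj->nkj", x, WEIGHTS)


def shift_and_log_scale(x, nu):
    """Sum independent quadratic contributions from each nuisance nu_k."""
    c = coefficients(x)
    shift = np.sum(c[..., 0] * nu + c[..., 1] * nu**2, axis=1)
    log_scale = np.sum(c[..., 2] * nu + c[..., 3] * nu**2, axis=1)
    return shift, log_scale


def forward(y, x, nu):
    """Deform the feature y; also return the log |dy_new / dy| term."""
    shift, log_scale = shift_and_log_scale(x, nu)
    return y * np.exp(log_scale) + shift, log_scale


def inverse(y_new, x, nu):
    """Undo the deformation exactly (the map is affine in y)."""
    shift, log_scale = shift_and_log_scale(x, nu)
    return (y_new - shift) * np.exp(-log_scale)
```

In this sketch every contribution vanishes at $\nu = 0$, so the layer reduces to the identity and the nominal density is left unchanged. The number of coefficient networks grows linearly with the number of nuisances, since no cross terms between different $\nu_k$ are modeled.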