Efficient Estimation of Unfactorizable Systematic Uncertainties
Alexis Romero, Kyle Cranmer, Daniel Whiteson
TL;DR
This work addresses the challenge of estimating jointly unfactorizable systematic uncertainties in high-dimensional collider data. It uses Gaussian process (GP) regression with derivative information to model observables as functions of the nuisance parameters, and a Bayesian experimental design (BED) strategy to select training points that minimize predictive uncertainty. Across a simple 1D toy model and 2D and 4D high-energy physics efficiency problems, derivative GPs consistently outperform regular GPs, and both outperform factorization-based baselines, with BED significantly reducing the number of samples required. The approach offers a scalable, nonparametric framework for uncertainty quantification in complex experimental settings, improving accuracy and reducing the computational cost of precision tests of the Standard Model.
Abstract
Accurate assessment of systematic uncertainties is an increasingly vital task in physics studies, where large, high-dimensional datasets, like those collected at the Large Hadron Collider, hold the key to new discoveries. Common approaches to assessing systematic uncertainties rely on simplifications, such as assuming that the impact of the various sources of uncertainty factorizes. In this paper, we provide realistic example scenarios in which this assumption fails. We introduce an algorithm that uses Gaussian process regression to estimate the impact of systematic uncertainties without assuming factorization. The Gaussian process models are enhanced with derivative information, which increases the accuracy of the regression without increasing the number of samples. In addition, we present a novel sampling strategy based on Bayesian experimental design, which is shown to be more efficient than random and grid sampling in our example scenarios.
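To make the role of derivative information concrete, the following is a minimal 1D sketch, not the implementation used in this work. It assumes a zero-mean GP with a squared-exponential (RBF) kernel and fixed hyperparameters, and uses sin(x) as a stand-in observable; the names `rbf_blocks` and `gp_predict` are hypothetical. Because differentiation is linear, function values and derivatives are jointly Gaussian, so derivative observations enter simply as extra rows and columns of the covariance matrix. The last line picks the candidate with the largest posterior variance, which is only a simple proxy for an uncertainty-reducing design criterion and is not claimed to be the BED strategy of the paper.

```python
import numpy as np

def rbf_blocks(xa, xb, length=1.0, var=1.0):
    """Covariance blocks of a 1D RBF kernel between f and its derivative f'."""
    d = xa[:, None] - xb[None, :]                    # pairwise differences x - x'
    k = var * np.exp(-0.5 * d**2 / length**2)        # cov(f(x), f(x'))
    k_fd = k * d / length**2                         # cov(f(x), f'(x'))
    k_dd = k * (1.0 / length**2 - d**2 / length**4)  # cov(f'(x), f'(x'))
    return k, k_fd, k_dd

def gp_predict(x_train, y_train, x_test, dy_train=None,
               length=1.0, var=1.0, jitter=1e-8):
    """Posterior mean and variance of f at x_test.

    If dy_train is given, derivative observations at x_train are
    conditioned on in addition to the function values.
    Hyperparameters are fixed by assumption (no fitting is done here).
    """
    k, k_fd, k_dd = rbf_blocks(x_train, x_train, length, var)
    ks, ks_fd, _ = rbf_blocks(x_test, x_train, length, var)
    if dy_train is None:
        K, Ks, y = k, ks, y_train
    else:
        # Joint covariance of the stacked vector [f(X), f'(X)].
        K = np.block([[k, k_fd], [k_fd.T, k_dd]])
        Ks = np.hstack([ks, ks_fd])
        y = np.concatenate([y_train, dy_train])
    K = K + jitter * np.eye(K.shape[0])              # numerical stabilizer
    mean = Ks @ np.linalg.solve(K, y)
    reduction = np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    variance = np.maximum(var - reduction, 0.0)      # clip round-off negatives
    return mean, variance

# Toy stand-in for an observable as a function of one nuisance parameter.
x_train = np.linspace(0.0, 2.0 * np.pi, 4)
x_test = np.linspace(0.0, 2.0 * np.pi, 200)
y_train, dy_train = np.sin(x_train), np.cos(x_train)

mean_f, var_f = gp_predict(x_train, y_train, x_test)                     # values only
mean_d, var_d = gp_predict(x_train, y_train, x_test, dy_train=dy_train)  # values + derivatives

# Variance-driven choice of the next training point (illustrative proxy only).
x_next = x_test[np.argmax(var_d)]
```

With the same four training locations, conditioning on derivatives can only lower the posterior variance relative to the value-only model (up to numerical round-off), which is the sense in which derivative information adds accuracy without adding samples. In more than one dimension the same construction applies with one derivative block per nuisance parameter.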
