Kernel Estimation in High-Energy Physics
Kyle S. Cranmer
TL;DR
This paper surveys kernel estimation as an unbinned, non-parametric method of density estimation for high-energy physics. It develops the univariate and multivariate theory, including fixed and adaptive bandwidths, boundary handling, covariance considerations, and event weighting, and then describes applications ranging from confidence level calculations to discriminant analysis and cut optimization. It also catalogs the available software packages (KEYS, HEPUKeys, PDE, RootPDE, WinPDE), contrasts kernel methods with SMOOTH, and discusses the systematic errors associated with estimating parent distributions. The result is a practical, theory-grounded toolkit for density estimation in HEP analyses that reduces binning artifacts and improves the handling of boundaries and heterogeneous data.
Abstract
Kernel estimation provides an unbinned and non-parametric estimate of the probability density function from which a set of data is drawn. In the first section, after a brief discussion of parametric and non-parametric methods, the theory of kernel estimation is developed for univariate and multivariate settings. The second section discusses some applications of kernel estimation to high-energy physics. The third section provides an overview of the available univariate and multivariate packages. The paper concludes with a discussion of the inherent advantages of kernel estimation techniques and of the systematic errors associated with the estimation of parent distributions.
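
To make the idea of an unbinned, non-parametric estimate concrete, the following is a minimal sketch of a univariate, fixed-bandwidth kernel estimate with optional event weights. It is an illustration under stated assumptions, not the implementation used by any of the packages named above: the function name gaussian_kde_1d, the choice of a Gaussian kernel, and the normal-reference rule for the default bandwidth are all assumptions made here. Adaptive bandwidths and boundary corrections are deliberately left out.

```python
import numpy as np


def gaussian_kde_1d(data, x, bandwidth=None, weights=None):
    """Fixed-bandwidth Gaussian kernel estimate of a 1-D density.

    Hypothetical helper for illustration only. Each event contributes a
    Gaussian bump of width `bandwidth` centred on its value, and the
    estimate at `x` is the weighted sum of those bumps.
    """
    data = np.asarray(data, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    n = data.size

    # Event weights are normalised to sum to one so that the estimate
    # integrates to one. Unweighted events each get weight 1/n.
    if weights is None:
        weights = np.full(n, 1.0 / n)
    else:
        weights = np.asarray(weights, dtype=float)
        weights = weights / weights.sum()

    # Assumed default: the normal-reference rule of thumb,
    # h = (4/3)^(1/5) * sigma * n^(-1/5).
    if bandwidth is None:
        bandwidth = (4.0 / 3.0) ** 0.2 * data.std(ddof=1) * n ** (-0.2)

    # z[i, j] is the scaled distance from evaluation point i to event j.
    z = (x[:, None] - data[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * bandwidth)

    # Weighted sum over events for every evaluation point.
    return kernels @ weights


# Usage: estimate the density of a toy sample on a regular grid.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=500)
grid = np.linspace(-5.0, 5.0, 201)
density = gaussian_kde_1d(sample, grid)
```

No histogram is filled at any point, so the result does not depend on a choice of bin edges. The only tuning parameter is the bandwidth, which controls the trade-off between smoothing away statistical fluctuations and washing out real structure.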
