Under-coverage in high-statistics counting experiments with finite MC samples

Cristina-Andreea Alexe, Joshua Bendavid, Lorenzo Bianchini, Davide Bruschini

TL;DR

The paper investigates confidence-interval (CI) coverage for a parameter of interest in high-statistics, binned counting experiments where the templates are built from Monte Carlo (MC) samples of finite size. It demonstrates that standard asymptotic CI methods based on Wilks’ theorem or Hessian matrices can exhibit systematic under-coverage due to MC fluctuations and nuisance parameters, even with large data samples. By analyzing a paradigmatic toy model and then generalizing to the full MC-uncertainty framework, the authors show that biases arising from fluctuations in both the Jacobian blocks ${\bf b}$ and ${\bf A}$ can distort the profiled likelihood and reduce coverage. A practical heuristic interval and a scaling-based diagnostic are proposed to gauge and mitigate these effects, while highlighting that the correct likelihood is the full Barlow-Beeston form; in many realistic settings, asymptotic formulas may require substantial MC-sample augmentation or alternative interval constructions to ensure reliable inference.

Abstract

We consider the problem of setting confidence intervals on a parameter of interest from the maximum-likelihood fit of a physics model to a binned data set with a large number of bins, large event-counts per bin, and in the presence of systematic uncertainties modeled as nuisance parameters. We use the profile-likelihood ratio for statistical inference and focus on the case in which the model is determined from Monte Carlo simulated samples of finite size. We start by presenting a toy model in which the properties of widely used approximations of the profile-likelihood ratio in the asymptotic limit, which are commonly expected to hold in the high-statistics regime, are manifestly broken even if the numbers of events per bin in both the data and simulated samples are seemingly large enough to warrant their validity. We then move to the general setting to show how statistical uncertainties in the Monte Carlo predictions can affect the coverage of confidence intervals constructed in the asymptotic approximation always in the same direction, namely they lead to systematic under-coverage.
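
As a rough illustration of what "coverage" means operationally, the sketch below measures the empirical coverage of a 1-sigma Hessian interval from an ensemble of pseudo-experiments. It is not the paper's toy model and it does not use the Barlow-Beeston likelihood: it assumes a single signal template with flat expected yield `s` per bin, Gaussian-distributed data, a known data variance, and a fit that neglects the MC uncertainty altogether, which is a cruder effect than the one studied in the paper. The names `empirical_coverage`, `k` (ratio of MC to data sample size), `n_bins`, and `n_toys` are hypothetical and chosen only for this example.

```python
import math
import random


def empirical_coverage(k, n_bins=50, s=1.0e4, n_toys=2000, seed=1):
    """Fraction of pseudo-experiments whose 1-sigma interval contains mu_true.

    k is an assumed ratio between the MC and data sample sizes.
    """
    rng = random.Random(seed)
    mu_true = 1.0
    covered = 0
    for _toy in range(n_toys):
        num = 0.0  # sum_i t_i * y_i / v_i
        den = 0.0  # sum_i t_i^2 / v_i
        for _bin in range(n_bins):
            # Data: Gaussian approximation of a Poisson count with mean mu_true * s.
            y = rng.gauss(mu_true * s, math.sqrt(mu_true * s))
            # Template: true yield smeared by the finite MC sample size.
            t = rng.gauss(s, math.sqrt(s / k))
            v = s  # data variance, taken as known for simplicity
            num += t * y / v
            den += t * t / v
        mu_hat = num / den              # least-squares estimate of mu
        sigma_h = 1.0 / math.sqrt(den)  # Hessian uncertainty, MC noise neglected
        if abs(mu_hat - mu_true) <= sigma_h:
            covered += 1
    return covered / n_toys


coverage_large_mc = empirical_coverage(k=100.0)
coverage_small_mc = empirical_coverage(k=1.0)
print(f"k=100: {coverage_large_mc:.3f}   k=1: {coverage_small_mc:.3f}")
```

Under these assumptions, the interval built with a large MC sample should cover the true value at close to the nominal 68% rate, while the one built with an MC sample as small as the data should cover it noticeably less often, because the template fluctuations add spread to $\hat{\mu}$ that the Hessian uncertainty does not account for.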

Paper Structure (32 sections, 63 equations, 5 figures, 14 tables)

This paper contains 32 sections, 63 equations, 5 figures, 14 tables.

Figures (5)

  • Figure 1: Examples of templates $T_{ji}$ generated as described in the text for $N=2\times 10^6$, $\epsilon=0.03$, $n=200$, and for two values of $k_1=k_2=k$, namely $k=1$ (left) and $k=10$ (right). The horizontal dashed lines in each plot provide the expected number of events per bin for the signal and background processes.
  • Figure 2: Sampling distribution of $t_\mu$, the profile-likelihood function in the Gaussian approximation, evaluated at $\mu \equiv \mu^\prime_1=0$ for data distributed according to the nominal model (i.e. with $\mu^\prime_{\rm t}=0$) with $N=2\times 10^6$, $\epsilon=0.03$, $n=200$, and $k=1$, obtained from an ensemble of $10^4$ identically repeated pseudo-experiments.
  • Figure 3: Sampling distributions of the point-estimator $\hat{\mu} \equiv \hat{\mu}^\prime_1$ and of its Hessian uncertainty $\hat{\sigma}_{\rm H}$ obtained from the Gauss + Barlow-Beeston likelihood applied to data distributed according to the nominal model (i.e. with $\mu_{\rm t}=0$) with $N=2\times 10^6$, $\epsilon=0.03$, $n=200$, and $k=1$, obtained from an ensemble of $10^4$ identically repeated pseudo-experiments. For both distributions, a Gaussian fit is overlaid for reference.
  • Figure 4: A sketch showing the effect of statistical fluctuations on the relevant quantities introduced in the text. The simple case $n=2$ and $p=1$ has been considered.
  • Figure 5: The ratio between the expectation value $\langle \hat{S} \rangle$ and the true value $S$ for the two-dimensional problem of eq. \ref{eq:simple} as a function of the squared correlation coefficient $\rho^2_\mu$, and for three representative values of $\langle\alpha^2\rangle$.