Handling uncertainties in background shapes: the discrete profiling method

P. D. Dauncey, M. Kenzie, N. Wardle, G. J. Davies

TL;DR

The paper tackles uncertainties that arise when the background shape is unknown and cannot be anchored by theory or simulation. It introduces discrete profiling, which treats the choice of background function as a discrete nuisance parameter and takes the envelope of the profile-likelihood curves of all candidate functions to obtain a robust overall inference. Using toy studies and an example modelled on the CMS $H\rightarrow\gamma\gamma$ analysis, it demonstrates good coverage and small bias for the envelope approach, and analyzes the penalties needed when comparing functions with different numbers of parameters. It concludes that a p-value-based correction is practical and effective, enabling broader application of discrete profiling to problems with unknown background shapes.

Abstract

A common problem in data analysis is that the functional form, as well as the parameter values, of the underlying model which should describe a dataset is not known a priori. In these cases some extra uncertainty must be assigned to the extracted parameters of interest due to lack of exact knowledge of the functional form of the model. A method for assigning an appropriate error is presented. The method is based on considering the choice of functional form as a discrete nuisance parameter which is profiled in an analogous way to continuous nuisance parameters. The bias and coverage of this method are shown to be good when applied to a realistic example.
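The idea of profiling a discrete nuisance parameter can be illustrated with a minimal sketch. It assumes that a $\Lambda(\mu)$ scan has already been obtained for each candidate function; here the scans are hypothetical parabolas with made-up minima, widths and offsets, not results from the paper. The envelope is the pointwise minimum over the candidates, and the 68.3% interval is read off where the envelope rises by 1 unit above its minimum. All candidates are assumed to have the same number of parameters, so no penalty term is applied.

```python
import numpy as np

def envelope(curves):
    """Pointwise minimum over the candidate-function Lambda curves."""
    return np.min(np.asarray(curves), axis=0)

def interval(mu, lam, delta=1.0):
    """Range of mu where lam lies within `delta` of its minimum.

    delta=1 corresponds to 68.3%, delta=4 to 95.4%.
    Assumes the accepted region is a single connected range.
    """
    inside = mu[lam <= lam.min() + delta]
    return inside.min(), inside.max()

mu = np.linspace(-2.0, 4.0, 6001)

# Hypothetical parabolic scans: (best-fit mu, width, Lambda offset at minimum).
# These numbers are illustrative assumptions only.
candidates = {
    "power law":   (1.0, 0.5, 0.0),
    "exponential": (1.4, 0.5, 0.3),
    "laurent":     (0.9, 0.5, 0.1),
}
curves = {
    name: ((mu - m) / s) ** 2 + offset
    for name, (m, s, offset) in candidates.items()
}

env = envelope(list(curves.values()))
best_mu = mu[np.argmin(env)]
lo, hi = interval(mu, env, delta=1.0)

# Interval that would be quoted if only one function were considered.
single_lo, single_hi = interval(mu, curves["power law"], delta=1.0)

print(f"envelope:   mu = {best_mu:.3f}, 68.3% interval [{lo:.3f}, {hi:.3f}]")
print(f"single fit: 68.3% interval [{single_lo:.3f}, {single_hi:.3f}]")
```

In this toy the envelope interval is wider than the single-function interval, which is how the uncertainty from the choice of function enters the result.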

Paper Structure

This paper contains 15 sections, 4 equations, 16 figures.

Figures (16)

  • Figure 1: Illustration of the construction of the envelope when performing a profile likelihood scan for a variable of interest $x$. The blue, solid line shows the $\Lambda$ curve with the nuisance parameters fixed to their best fit values. The black, solid line shows the full profile curve, for which the nuisance parameters are fitted at every value of $x$. The red, dashed lines show the $\Lambda$ curves for fixed nuisance parameter values other than those at the best fit. The green, dot-dashed line is the envelope, i.e. the lowest value of any of the red dashed curves for each $x$. Even with such a coarse sampling of the nuisance parameters, the envelope approximates the full profile curve.
  • Figure 2: Best fits of the four two-parameter functions (described in the text). The Laurent function is effectively identical to the power law function and so is hidden underneath the power law line. Note, for clarity in this plot, the data have been rebinned into 40 bins, although the fits were performed with a finer binning of 160 bins.
  • Figure 3: Profile $\Lambda$ scans for the four two-parameter functions discussed in the text. The polynomial function is above the top of the $\Lambda$ scale for all $\mu$ values shown in this figure.
  • Figure 4: Profile $\Lambda$ envelope for the four two-parameter function fits. The coloured bands indicate the 68.3% and 95.4% intervals determined from the regions for which the value of $\Lambda$ increases by 1 and 4 units from the minimum value as indicated by the horizontal lines. The dashed red line shows the profile $\Lambda$ curve which would be obtained using just the power law function.
  • Figure 5: Distribution of $\Delta\Lambda$, the difference between the $\Lambda$ value with $\mu$ fixed to its true value and $\Lambda$ at the best fit value of $\mu$. These values are from fits to toy datasets generated with $\mu=1$ and with the power law function, for which the parameters are fixed to the best fit values as described in the previous section. The black data points are from the toy dataset fits and the green function shows the expected $\chi^2$ distribution for one degree of freedom. The blue and red histograms show the distributions from the toys separated into cases where the best fit uses the same or different functions, respectively, compared with the function used to generate the toys. The power law (same function) was the best fit function in 34.1% of the toys, while the Laurent and exponential were the best fit functions in 36.5% and 29.4% of the toys, respectively. The dashed red and blue lines are $\chi^{2}$ distributions with one degree of freedom normalized to the number of toys in the red and blue histograms respectively.
  • ...and 11 more figures
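
The envelope construction described in the Figure 1 caption can be reproduced numerically with a small sketch. It assumes a toy $\Lambda(x,\theta)$ given by a correlated two-dimensional parabola with a single nuisance parameter $\theta$; the functional form, the correlation value and the grid of fixed $\theta$ values are illustrative assumptions, not taken from the paper. For this toy the full profile is known analytically, so the envelope of a few fixed-$\theta$ curves can be compared with it directly.

```python
import numpy as np

RHO = 0.6  # assumed correlation between x and the nuisance parameter theta

def lam(x, theta):
    """Toy Lambda(x, theta): a correlated two-dimensional parabola (assumed form)."""
    return (x**2 - 2.0 * RHO * x * theta + theta**2) / (1.0 - RHO**2)

x = np.linspace(-3.0, 3.0, 601)

# Full profile: theta minimised for each x. For this toy the minimum is at
# theta_hat(x) = RHO * x, which gives Lambda = x**2.
full_profile = lam(x, RHO * x)

# Nuisance parameter fixed to its global best-fit value (theta = 0).
fixed_best = lam(x, 0.0)

# Coarse set of fixed theta values, one Lambda curve per value.
theta_grid = np.linspace(-2.0, 2.0, 9)
fixed_curves = lam(x[None, :], theta_grid[:, None])  # shape (9, 601)

# Envelope: lowest value of any fixed-theta curve at each x.
env = fixed_curves.min(axis=0)

max_gap = np.max(env - full_profile)
print(f"largest difference between envelope and full profile: {max_gap:.3f}")
```

The envelope never falls below the full profile and never rises above the curve with $\theta$ fixed at its best fit value; with nine fixed values it stays within about 0.1 units of the full profile over the scanned range, and a finer grid would shrink that gap.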