A response-matrix-centred approach to presenting cross-section measurements

Lukas Koch

TL;DR

The paper addresses the ill-posed problem of unfolding by introducing a forward-folding, response-matrix-centred method. Truth-space expectations $\mu_j$ are mapped to reconstructed expectations $\nu_i$ through a detector response matrix $R_{ij}$, and predictions are compared to the data directly in reconstructed space. Detector uncertainties are handled by generating toy matrices $R^t$ from the distributions of the nuisance parameters, which allows either marginalisation or profiling in the likelihood. Backgrounds can be treated either in truth space or as reco-space templates. A Bayesian-inspired, three-step procedure builds the matrices from MC with priors on efficiency and smearing, propagating statistical and systematic uncertainties into $R^t_{ij}$. The implementation is realised in the ReMU software, which provides tools to construct, store, and use response matrices with standard Python libraries, so that non-collaborators can test their own models against published data. Together, these contributions offer a detector-agnostic alternative or complement to unfolding for cross-section measurements.

Abstract

The current canonical approach to publishing cross-section data is to unfold the reconstructed distributions. Detector effects like efficiency and smearing are undone mathematically, yielding distributions in true event properties. This is an ill-posed problem, as even small statistical variations in the reconstructed data can lead to large changes in the unfolded spectra. This work presents an alternative or complementary approach: the response-matrix-centred forward-folding approach. It offers a convenient way to forward-fold model expectations in truth space to reconstructed quantities. These can then be compared to the data directly, similar to what is usually done with full detector simulations within the experimental collaborations. For this, the detector response (efficiency and smearing) is parametrised as a matrix. The effect of the detector on the measurement of a given model is simulated by simply multiplying the binned truth expectation values by this response matrix. Systematic uncertainties in the detector response are handled by providing a set of matrices according to the prior distribution of the detector properties and marginalising over them. Background events can be included in the likelihood calculation by giving background events their own bins in truth space. To facilitate a straightforward use of response matrices, a new software framework has been developed: the Response Matrix Utilities (ReMU). ReMU is a Python package distributed via the Python Package Index. It only uses widely available, standard scientific Python libraries and does not depend on any custom experiment-specific software. It offers all methods needed to build response matrices from Monte Carlo data sets, use the response matrix to forward-fold truth-level model predictions, and compare the predictions to real data using Bayesian or frequentist statistical inference.
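The forward-folding step and the treatment of detector uncertainties described in the abstract can be illustrated with a minimal sketch in plain NumPy. It deliberately does not use the ReMU API. The matrix values, the bin counts, and the single efficiency-scale nuisance parameter used to generate the toy matrices are assumptions made for illustration only; the paper's own three-step matrix construction is not reproduced here. The sketch forward-folds a truth-space expectation via $\nu_i = \sum_j R_{ij} \mu_j$, evaluates a Poisson likelihood in reco space, and then either averages over the toy matrices (marginal likelihood) or takes the maximum over them (profile likelihood).

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical response matrix R[i, j]: probability that an event from
# truth bin j ends up in reco bin i (3 reco bins, 2 truth bins).
# Each column sums to the efficiency of that truth bin (0.8 here).
R_nominal = np.array([
    [0.6, 0.1],
    [0.2, 0.5],
    [0.0, 0.2],
])


def forward_fold(R, mu):
    """Reco-space expectation nu_i = sum_j R_ij * mu_j."""
    return R @ mu


def poisson_log_likelihood(n, nu):
    """Poisson log-likelihood of counts n given expectations nu.

    The log(n!) term is dropped because it does not depend on the model.
    """
    return np.sum(n * np.log(nu) - nu, axis=-1)


# Toy matrices R^t. Assumption for illustration: a single nuisance
# parameter scales the overall efficiency with a 5% Gaussian prior,
# clipped so that all efficiencies stay between 0 and 1.
n_toys = 500
scale = np.clip(rng.normal(1.0, 0.05, size=n_toys), 0.5, 1.25)
R_toys = scale[:, None, None] * R_nominal  # shape (n_toys, 3, 2)


def toy_log_likelihoods(n, mu):
    """Log-likelihood of the data for every toy matrix."""
    nu = R_toys @ mu  # shape (n_toys, 3)
    return poisson_log_likelihood(n, nu)


def marginal_log_likelihood(n, mu):
    """Log of the likelihood averaged over all toy matrices."""
    ll = toy_log_likelihoods(n, mu)
    m = ll.max()  # subtract the maximum for numerical stability
    return m + np.log(np.mean(np.exp(ll - m)))


def profile_log_likelihood(n, mu):
    """Log-likelihood of the best-fitting toy matrix."""
    return toy_log_likelihoods(n, mu).max()


# Made-up data and two made-up truth-space models to compare.
data = np.array([65, 45, 10])
mu_a = np.array([100.0, 50.0])
mu_b = np.array([50.0, 100.0])

print("nominal reco expectation of model A:", forward_fold(R_nominal, mu_a))
print("marginal log-likelihood ratio A vs B:",
      marginal_log_likelihood(data, mu_a) - marginal_log_likelihood(data, mu_b))
```

Because every model is only ever multiplied by the stored matrices, testing a new model requires no detector simulation, only a binned truth-space prediction.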

Paper Structure

This paper contains 21 sections, 61 equations, 4 figures.

Figures (4)

  • Figure 1: The response-matrix-centred approach. The aim of the approach presented here is to replace the computationally intensive full detector simulation (top) with a much simpler matrix multiplication (centre). This would allow a much faster test of different models against the data (bottom).
  • Figure 2: Profile likelihood and unbounded detector parameters. If the best possible fit value for a detector parameter lies outside the expected range, this can lead to unwanted effects in combination with the use of a profile likelihood. The best fit value (red) keeps increasing with the number of random evaluations in the case of the normally distributed parameter (blue). When the parameter uncertainty is assumed to be a bounded uniform distribution, the value approaches a limiting value much more quickly (orange).
  • Figure 3: Forward-folded background. Background processes get their own separate binning in truth space. Future changes in the modelling of the background are possible. Data releases can include templates of the background distributions, so users of the response matrix will not have to provide their own background estimates.
  • Figure 4: Template background. The reco-space shape of the background is stored as columns in the response matrix. The corresponding truth bins decide the strength/weight of the background. Future changes in the modelling of the background are not possible. Only the weights can be varied.
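The two background treatments described in the captions of Figures 3 and 4 can be sketched as follows. This is a plain NumPy illustration with made-up numbers, not the ReMU API, and all matrix entries, bin counts, and variable names are hypothetical. In the forward-folded case the background has its own truth bins and its own block of the response matrix, so its truth-space model can be changed later. In the template case a fixed reco-space shape is stored as one extra column, and the single corresponding truth bin only sets its weight.

```python
import numpy as np

# Hypothetical signal response: 3 reco bins, 2 signal truth bins.
R_signal = np.array([
    [0.6, 0.1],
    [0.2, 0.5],
    [0.0, 0.2],
])
mu_signal = np.array([100.0, 50.0])

# Forward-folded background (Figure 3): the background gets its own
# truth bins and is folded through its own block of the matrix.
R_background = np.array([
    [0.3, 0.0],
    [0.1, 0.2],
    [0.0, 0.1],
])
mu_background = np.array([10.0, 20.0])  # truth-space model, can be replaced

R_folded = np.hstack([R_signal, R_background])           # shape (3, 4)
mu_folded = np.concatenate([mu_signal, mu_background])   # shape (4,)
nu_folded = R_folded @ mu_folded

# Template background (Figure 4): the reco-space shape is fixed and
# stored as one extra column; only its weight can be varied.
template = np.array([0.5, 0.3, 0.2])  # normalised reco-space shape
background_weight = 20.0              # expected number of background events

R_template = np.column_stack([R_signal, template])       # shape (3, 3)
mu_template = np.append(mu_signal, background_weight)    # shape (3,)
nu_template = R_template @ mu_template

print("forward-folded background:", nu_folded)
print("template background:      ", nu_template)
```

In both cases the background enters the likelihood through the same matrix multiplication as the signal; the difference lies only in how much of the background model is frozen into the published matrix.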