A Principled Approach to Randomized Selection under Uncertainty: Applications to Peer Review and Grant Funding

Alexander Goldberg, Giulia Fanti, Nihar B. Shah

TL;DR

This paper tackles top-$k$ selection under uncertainty when only interval estimates of proposal quality are available, acknowledging Knightian uncertainty. It introduces MERIT, a Maximin Efficient Randomized Interval Top-$k$ algorithm that solves a maximin ex ante objective over all rankings consistent with the quality intervals and enforces ex post validity to respect dominance relationships. The authors present a polynomial-time separation-oracle-based algorithm and a practical cutting-plane implementation that scales to over $10^4$ items, with post-processing guarantees that the resulting mechanism satisfies both ex ante optimality and ex post validity. Through axiomatic comparisons and extensive experiments on synthetic and real peer-review data, MERIT matches existing methods in expected utility under probabilistic models and significantly outperforms them in worst-case objective settings, while providing robust performance and interpretability. The work offers a principled, scalable framework for randomized funding and peer-review decision-making under uncertainty, with broad implications for policy and practice in grant allocation and high-stakes selection tasks.

Abstract

Many decision-making processes involve evaluating and then selecting items; examples include scientific peer review, job hiring, school admissions, and investment decisions. The eventual selection is performed by applying rules or deliberations to the raw evaluations, and then deterministically selecting the items deemed to be the best. These domains feature error-prone evaluations and uncertainty about future outcomes, which undermine the reliability of such deterministic selection rules. As a result, selection mechanisms involving explicit randomization that incorporate the uncertainty are gaining traction in practice. However, current randomization approaches are ad hoc, and as we prove, inappropriate for their purported objectives. In this paper, we propose a principled framework for randomized decision-making based on interval estimates of the quality of each item. We introduce MERIT (Maximin Efficient Randomized Interval Top-k), an optimization-based method that maximizes the worst-case expected number of top candidates selected, under uncertainty represented by overlapping intervals (e.g., confidence intervals or min-max intervals). MERIT provides an optimal resource allocation scheme under an interpretable notion of robustness. We develop a polynomial-time algorithm to solve the optimization problem and demonstrate empirically that the method scales to over 10,000 items. We prove that MERIT satisfies desirable axiomatic properties not guaranteed by existing approaches. Finally, we empirically compare algorithms on synthetic peer review data. Our experiments demonstrate that MERIT matches the performance of existing algorithms in expected utility under fully probabilistic review data models used in previous work, while outperforming previous methods with respect to our novel worst-case formulation.
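
To make the worst-case objective concrete, the following is a minimal sketch that evaluates it by brute force on a tiny instance. It is not the paper's polynomial-time algorithm. It assumes that a ranking is consistent with the intervals when an item whose lower bound exceeds another item's upper bound is always ranked above it, and that a mechanism is summarized by marginal selection probabilities `p` summing to $k$. The function names and the example intervals are hypothetical.

```python
from itertools import permutations

def dominates(a, b):
    """Interval a = (lo, hi) dominates b when a's lower bound exceeds b's upper bound."""
    return a[0] > b[1]

def consistent_rankings(intervals):
    """Yield each ranking (best item first) that keeps every dominating item above the item it dominates."""
    n = len(intervals)
    for order in permutations(range(n)):
        pos = {item: rank for rank, item in enumerate(order)}
        if all(pos[i] < pos[j]
               for i in range(n) for j in range(n)
               if dominates(intervals[i], intervals[j])):
            yield order

def worst_case_value(p, intervals, k):
    """Minimum over consistent rankings of the expected number of true top-k items selected."""
    return min(sum(p[i] for i in order[:k])
               for order in consistent_rankings(intervals))

# Hypothetical instance: item 0 dominates the rest; items 1-3 overlap one another.
intervals = [(0.8, 1.0), (0.3, 0.7), (0.2, 0.6), (0.1, 0.5)]
deterministic = [1.0, 1.0, 0.0, 0.0]   # always pick items 0 and 1
randomized = [1.0, 1/3, 1/3, 1/3]      # pick item 0, then one of 1-3 uniformly

print(worst_case_value(deterministic, intervals, k=2))
print(worst_case_value(randomized, intervals, k=2))
```

In this instance the deterministic rule guarantees only one true top-2 item in the worst case, because an adversarial consistent ranking can place item 2 or 3 second, while the randomized rule guarantees 4/3 in expectation. The enumeration is factorial in the number of items, so it only illustrates the objective on small inputs.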

Paper Structure

This paper contains 43 sections, 12 theorems, 14 equations, 11 figures, 3 tables, 9 algorithms.

Key Result

Proposition 2.1

Consider a funder who estimates a ranking of proposals from review data in the following fully Bayesian setting. The funder observes review data $y \in Y$ for a set of proposals of true quality $\theta \in X$, for some measurable sets $X$ and $Y$. The review data and true quality are generated joint

Figures (11)

  • Figure 1: Example that violates monotonicity with respect to $k$ for Swiss NSF and our MERIT algorithm. When $k=1$, $p_2 = 1/2$ for both algorithms. However, when $k=2$, $p_2 = 1/3$ for both algorithms.
  • Figure 2: Example that violates maximum instability for Swiss NSF and randomize-above-threshold, with $k=3$. Slightly decreasing the upper bound and point estimate of proposal $3$ changes the algorithm's behavior from selecting the top $3$ deterministically (left) to sampling among all $10$ proposals uniformly at random (right).
  • Figure 3: Proportion of top-$k$ proposals selected by different methods with quality data generated under the Swiss NSF model of linear miscalibration and under our model of worst-case over feasible rankings. MERIT matches the performance of algorithms designed for the Swiss NSF's linear model, with expected utility averaged over 50 samples of synthetic data and error bars showing the 95% CI for the sample mean. The gap between MERIT and other methods in the worst case over intervals defined by our model can be substantial, as shown by the gaps in NeurIPS Gaussian, NeurIPS Subjectivity, and ICLR Subjectivity.
  • Figure 4: Ablation study comparing methods under the Swiss NSF's model of linear miscalibration with varying levels of miscalibration. Error bars show bootstrapped 95% CIs for the sample mean over 50 samples of randomly generated data from the model.
  • Figure 5: Worst-case utility over Manski bound intervals as a function of the fraction of reviewer-proposal pairs missing, with random dropping of review scores to increase sparsity for the Swiss NSF panel review dataset. Error bars show bootstrapped 95% CIs for the sample mean over 50 trials of randomly dropping review scores.
  • ...and 6 more figures

Theorems & Definitions (29)

  • Proposition 2.1: Optimality of deterministic selection in the fully Bayesian setting
  • Theorem 4.1: Polynomial time solution
  • Lemma 4.2: Polynomial-time separation oracle
  • Definition 4.3: Number above ($A$) and number below ($B$)
  • Definition 4.4: Monotonically ordered subset
  • Theorem 4.5: Post-processing for ex post validity
  • Definition 5.1: Selection rule
  • Definition 5.2: Monotonicity in budget
  • Definition 5.3: Maximum instability
  • Definition 5.4: Reversal symmetry
  • ...and 19 more