Selection of the Most Probable Best

Taeho Kim; Kyoung-kuk Kim; Eunhye Song

TL;DR

The paper develops the most probable best (MPB) framework for ranking and selection (R&S) under input uncertainty with finite support, formulating a nested R&S problem that selects the solution most likely to be optimal across the realizations of the uncertain parameter. It derives the large deviations rate (LDR) of the probability of false selection (PFS), formulates an optimal computing budget allocation (OCBA) problem for the rate-maximizing static sampling ratios, relaxes that problem to tractable optimality conditions, and designs sequential learning algorithms that achieve strong consistency as the budget grows. To improve finite-sample performance, it uses kernel ridge regression to pool information across parameter values, with theoretical results showing asymptotic equivalence of the LDR under pooling. On a synthetic example and a market simulation for a new product release, the algorithms outperform a state-of-the-art contextual R&S benchmark on the PFS, FNR, and ACC measures, and the market case highlights MPB's value in risk-aware decision-making under economic uncertainty. The work provides a rigorous foundation for MPB-based decision making and points to future DP-based and CRN-enhanced extensions for further efficiency gains.
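To make the MPB definition concrete, the following is a minimal sketch of how the MPB could be computed if the true means were known. It assumes that smaller means are better (use `argmax` instead if larger is better) and that ties do not occur; the function name, array layout, and numbers are illustrative and not taken from the paper.

```python
import numpy as np

def most_probable_best(means, probs):
    """Return the MPB index and each solution's probability of being optimal.

    means: (k, B) array, means[i, b] is the mean of solution i at parameter value theta_b.
    probs: (B,) array, probability mass of each theta_b (finite support).
    Assumption: smaller mean is better.
    """
    means = np.asarray(means, dtype=float)
    probs = np.asarray(probs, dtype=float)
    # Conditional optimum at each parameter value theta_b.
    best_per_theta = means.argmin(axis=0)
    # Probability that solution i is optimal: total mass of the theta_b where i is best.
    pref = np.bincount(best_per_theta, weights=probs, minlength=means.shape[0])
    return int(pref.argmax()), pref

# Hypothetical example: 3 solutions, 4 parameter values.
means = np.array([[1.0, 2.0, 3.0, 0.5],
                  [2.0, 1.0, 1.0, 2.0],
                  [3.0, 3.0, 2.0, 1.0]])
probs = np.array([0.2, 0.3, 0.3, 0.2])
mpb, pref = most_probable_best(means, probs)
print(mpb, pref)
```

In practice the means are unknown and must be estimated by simulation, which is why the paper studies how to allocate the sampling budget across solution-parameter pairs.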

Abstract

We consider an expected-value ranking and selection (R&S) problem where all k solutions' simulation outputs depend on a common parameter whose uncertainty can be modeled by a distribution. We define the most probable best (MPB) to be the solution that has the largest probability of being optimal with respect to the distribution and design an efficient sequential sampling algorithm to learn the MPB when the parameter has a finite support. We derive the large deviations rate of the probability of falsely selecting the MPB and formulate an optimal computing budget allocation problem to find the rate-maximizing static sampling ratios. The problem is then relaxed to obtain a set of optimality conditions that are interpretable and computationally efficient to verify. We devise a series of algorithms that replace the unknown means in the optimality conditions with their estimates and prove the algorithms' sampling ratios achieve the conditions as the simulation budget increases. Furthermore, we show that the empirical performances of the algorithms can be significantly improved by adopting the kernel ridge regression for mean estimation while achieving the same asymptotic convergence results. The algorithms are benchmarked against a state-of-the-art contextual R&S algorithm and demonstrated to have superior empirical performances.
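The kernel ridge regression step mentioned in the abstract can be illustrated with a generic sketch that smooths one solution's sample means across the parameter values. The Gaussian kernel, the hyperparameters `length_scale` and `ridge`, and the choice to fit each solution separately are assumptions made for illustration; the paper's own estimator may differ.

```python
import numpy as np

def krr_pooled_means(thetas, sample_means, length_scale=1.0, ridge=0.1):
    """Kernel ridge regression fit of one solution's sample means over the theta_b.

    thetas: B parameter values (scalars or vectors).
    sample_means: B sample means of the same solution, one per theta_b.
    Returns the B fitted means K (K + ridge * I)^{-1} y.
    """
    thetas = np.asarray(thetas, dtype=float).reshape(len(thetas), -1)
    y = np.asarray(sample_means, dtype=float)
    # Gaussian (RBF) kernel matrix between all pairs of parameter values.
    sq_dist = ((thetas[:, None, :] - thetas[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-sq_dist / (2.0 * length_scale ** 2))
    # Solve the regularized linear system rather than inverting explicitly.
    alpha = np.linalg.solve(K + ridge * np.eye(len(y)), y)
    return K @ alpha

# Hypothetical noisy sample means at three parameter values.
fitted = krr_pooled_means([0.0, 1.0, 2.0], [1.0, 2.0, 3.0], length_scale=0.5, ridge=1e-10)
shrunk = krr_pooled_means([0.0, 1.0, 2.0], [1.0, 2.0, 3.0], length_scale=0.5, ridge=1e6)
print(fitted, shrunk)
```

With a very small `ridge` the fit reproduces the sample means, while a large `ridge` shrinks the fitted values toward zero because this sketch has no intercept term; intermediate values borrow strength from nearby parameter values.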

Paper Structure (24 sections, 29 theorems, 145 equations, 6 figures, 9 tables, 5 algorithms)

Introduction
Problem Definition
Learning Models for the MPB Estimation
Asymptotic Analysis of Sampling Allocation
Exact Optimal Computing Budget Allocation Formulation
Relaxed OCBA and its Optimality Conditions
Sufficient Conditions to Preserve Zero Sampling Ratios in the Exact OCBA Formulation
Sequential Learning Procedures
Algorithms for Learning Both MPB and its Favorable Set
Improving Finite-Sample Performance with Kernel Ridge Regression
Empirical Analysis
Synthetic example
Market simulation for new product release
Conclusions
Relations to Other Formulation
...and 9 more sections

Key Result

Theorem 1

Let $x^+ = \max(x, 0)$. Given fixed $\boldsymbol{\alpha}$, define and $\widetilde{\mathrm{LDR}}_{j, i^*} := \min\nolimits_{\mathbf{M} \in \mathcal{A}_j} \sum\nolimits_{(i, \theta_b) \in I(\mathbf{M})} \widetilde{G}_{i}(\theta_b)$. Then, we have (i) $\liminf_{n\rightarrow \infty}-\frac{1}{n}\log\mathrm{P}(m_{i, b}^n = 1) = \widetilde{G}_i(\theta_b)$; (ii) $\liminf\nolimits_{n

Figures (6)

Figure 1: Performance measure estimates from 10,000 macro runs for each scenario.
Figure 2: The sum of sampling ratio allocated at the adversarial set paired with $i^*$ for Baseline and Scenario 2.
Figure 3: Performance measure estimates from 100,000 macro runs.
Figure EC.1: Illustration of $\min_{i \neq i^b}\Gamma_{i, n+t}(\theta_b)$ (black line) and $\min_{i \neq i^{b^{\prime}}}\Gamma_{i, n+t}(\theta_{b^{\prime}})$ (red line) when $\mathcal{M}$ is nonempty.
Figure EC.2: Performance measure estimates from 10,000 macro runs for Scenarios 4 and 5.
...and 1 more figures

Theorems & Definitions (30)

Theorem 1
Lemma 1
Theorem 2
Proposition 1
Theorem 3: Optimality conditions for the relaxed OCBA problem
Proposition 2
Corollary 1
Proposition 3
Proposition 4
Theorem 4
...and 20 more
