An Upper Confidence Bound Approach to Estimating the Maximum Mean
Zhang Kun, Liu Guangwu, Shi Wen
TL;DR
This work studies estimation of the maximum mean $\mu^* = \max_k \mu_k$ across $K$ stochastic systems when the sampling budget is allocated adaptively by a generalized UCB policy. It analyzes the existing Grand Average (GA) estimator $\widetilde{M}_n$ and proposes a new Largest-Size Average (LSA) estimator $M_{I^*_n}$. For both estimators it establishes strong consistency, mean squared errors of order $O(1/n)$, bias rates, and central limit theorems, which in turn yield asymptotically valid confidence intervals. The bias of LSA decays at rate $O(\nu_n/n^{3/2})$, faster than GA's $O(\nu_n/n)$. The central limit theorems also support a single hypothesis test for a multiple comparison problem on the maximum mean, with application to clinical trials and related settings. Numerical experiments on coherent risk measures, clinical trials, and call-center robustness show that LSA offers better finite-sample performance and tighter inference while retaining the asymptotic guarantees.
Abstract
Estimating the maximum mean finds a variety of applications in practice. In this paper, we study estimation of the maximum mean using an upper confidence bound (UCB) approach, in which each unit of the sampling budget is adaptively allocated to one of the systems. We study in depth the existing grand average (GA) estimator, and propose a new largest-size average (LSA) estimator. Specifically, we establish statistical guarantees for both estimators, including strong consistency, asymptotic mean squared errors, and central limit theorems (CLTs), all of which are new to the literature. We show that LSA is preferable to GA, as the bias of the former decays at a much faster rate than that of the latter as the sample size increases. Using the CLTs, we further construct asymptotically valid confidence intervals for the maximum mean, and propose a single hypothesis test for a multiple comparison problem with application to clinical trials. The statistical efficiency of the resulting point and interval estimates and of the proposed single hypothesis test is demonstrated via numerical examples.
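To make the two estimators concrete, the following is a minimal sketch rather than the paper's procedure. It assumes a textbook UCB1-style index (sample mean plus $\sqrt{c \log t / N_k}$) in place of the generalized UCB policy, which is not specified here. It also assumes GA is the average of all $n$ collected samples and LSA is the sample mean of the system that received the most samples. The function name `ucb_max_mean`, the constant `c`, and the Gaussian test systems are hypothetical choices for illustration.

```python
import math
import random


def ucb_max_mean(samplers, n, c=2.0):
    """Run a UCB1-style policy for n rounds and return (GA, LSA, counts).

    samplers: list of zero-argument callables, one per system (assumed interface).
    c: exploration constant in the bonus sqrt(c * log(t) / N_k) (assumed form).
    """
    k = len(samplers)
    if n < k:
        raise ValueError("need at least one sample per system")
    counts = [0] * k   # N_k: number of samples drawn from system k
    sums = [0.0] * k   # running sum of samples from system k
    for t in range(1, n + 1):
        if t <= k:
            arm = t - 1  # initialisation: sample every system once
        else:
            # choose the system with the largest upper confidence bound
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(c * math.log(t) / counts[i]),
            )
        counts[arm] += 1
        sums[arm] += samplers[arm]()
    ga = sum(sums) / n                                # grand average of all samples
    largest = max(range(k), key=lambda i: counts[i])  # most-sampled system
    lsa = sums[largest] / counts[largest]             # its sample mean
    return ga, lsa, counts


# Hypothetical example: three Gaussian systems with means 0, 0.5 and 1.
rng = random.Random(0)
means = [0.0, 0.5, 1.0]
samplers = [lambda m=m: rng.gauss(m, 1.0) for m in means]
ga, lsa, counts = ucb_max_mean(samplers, n=5000)
print(ga, lsa, counts)
```

In this sketch GA mixes in the samples drawn from suboptimal systems during exploration, whereas LSA uses only the samples of the most-sampled system. This gives an informal view of why the two estimators can have different biases.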
