Identifying Approximate Minimizers under Stochastic Uncertainty
Hessa Al-Thani, Viswanath Nagarajan
TL;DR
This work tackles the stochastic minimum query problem, where one must identify a $\delta$-approximate minimum (or maximum) among $n$ independent, interval-bounded random variables with query costs. Two variants are considered: SMQ, which asks for a $\delta$-minimum value, and SMQI, which asks for a $\delta$-minimizer. The paper introduces non-adaptive policies that interleave two natural greedy criteria and achieve constant-factor approximations: a 4-approximation for unit costs in both SMQ and SMQI, a 5.83-approximation for SMQ with non-uniform costs, and a 7.47-approximation for SMQI with non-uniform costs. The analysis hinges on stopping-probability comparisons and stochastic dominance, and extends from unit-cost to general-cost settings using power-of-two iterations and a knapsack subroutine. A key contribution is showing that non-adaptive policies can closely approximate adaptive-optimal performance, which yields explicit adaptivity-gap bounds and procedures that are simple to implement. The results add constant-factor guarantees to the literature on stochastic probing and exploration problems.
Abstract
We study a fundamental stochastic selection problem involving $n$ independent random variables, each of which can be queried at some cost. Given a tolerance level $\delta$, the goal is to find a value that is $\delta$-approximately minimum (or maximum) over all the random variables, at minimum expected cost. A solution to this problem is an adaptive sequence of queries, where the choice of the next query may depend on previously-observed values. Two variants arise, depending on whether the goal is to find a $\delta$-minimum value or a $\delta$-minimizer. When all query costs are uniform, we provide a $4$-approximation algorithm for both variants. When query costs are non-uniform, we provide a $5.83$-approximation algorithm for the $\delta$-minimum value and a $7.47$-approximation for the $\delta$-minimizer. All our algorithms rely on non-adaptive policies (that perform a fixed sequence of queries), so we also upper bound the corresponding "adaptivity" gaps. Our analysis relates the stopping probabilities in the algorithm and optimal policies, where a key step is in proving and using certain stochastic dominance properties.
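
To make the notion of a non-adaptive policy and its expected cost concrete, the following is a minimal Python sketch of the $\delta$-minimum value variant. It is an illustration under stated assumptions, not the paper's algorithm: the tolerance $\delta$ is taken to be additive, each variable is assumed uniform on a known interval, and the stopping rule (stop once the smallest observed value is within $\delta$ of every unqueried variable's lower endpoint) is a simplified reading of the problem. The function names `run_fixed_order` and `estimate_expected_cost` are hypothetical.

```python
import random


def run_fixed_order(order, costs, lows, values, delta):
    """Query variables in a fixed (non-adaptive) order and return the cost paid.

    Assumed stopping rule: stop as soon as the smallest observed value is at
    most delta above the lower endpoint of every variable not yet queried.
    """
    best = float("inf")
    unqueried = set(range(len(values)))
    paid = 0.0
    for i in order:
        paid += costs[i]
        unqueried.discard(i)
        best = min(best, values[i])
        # Smallest value any unqueried variable could still take.
        floor = min((lows[j] for j in unqueried), default=float("inf"))
        if best <= floor + delta:
            return paid
    return paid


def estimate_expected_cost(order, costs, intervals, delta, trials=10000, seed=0):
    """Monte Carlo estimate of the expected cost of a fixed query order.

    Assumption: variable i is uniform on intervals[i] = (low, high).
    """
    rng = random.Random(seed)
    lows = [lo for lo, _ in intervals]
    total = 0.0
    for _ in range(trials):
        values = [rng.uniform(lo, hi) for lo, hi in intervals]
        total += run_fixed_order(order, costs, lows, values, delta)
    return total / trials


if __name__ == "__main__":
    intervals = [(0.0, 4.0), (1.0, 3.0), (2.0, 6.0)]
    costs = [1.0, 1.0, 1.0]
    for order in ([0, 1, 2], [2, 1, 0]):
        print(order, estimate_expected_cost(order, costs, intervals, delta=0.5))
```

Comparing the estimates for different orders shows why the choice of sequence matters: an order that queries variables with low left endpoints early tends to certify a $\delta$-minimum value sooner under these assumptions.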
