Optimal Guarantees for Online Selection Over Time
Sebastian Perez-Salazar, Victor Verdugo
TL;DR
The paper advances the theory of prophet inequalities over time (POT) in the IID setting by deriving best-possible worst-case guarantees for both single-threshold policies and the optimal dynamic programming policy. It develops a density-based analysis and a convex-optimization framework to bound policies with a small number of thresholds (1, 2, and 3), including a tight bound in the single-threshold case, and then characterizes the worst-case ratio of the optimal policy for any finite horizon $n$ via a convex program, with a limit analysis yielding an asymptotic ratio of about $0.618$. It also extends the discussion to adversarial and random-order models, showing an optimal worst-case ratio of about $0.162$ in random-order POT and a hardness result under adversarial ordering. The methods connect threshold-based policies to infinite- and finite-dimensional convex programs, enabling exact computation of worst-case ratios and clarifying the gap between simple, implementable strategies and the optimal offline benchmark. Overall, the work sharpens the understanding of the trade-off between commitment duration and opportunity capture, with implications for online selection and pricing mechanisms under time-constrained decisions.
Abstract
Prophet inequalities are a cornerstone of optimal stopping and online decision-making. Traditionally, a decision-maker sequentially observes $n$ non-negative independent random variables and makes irrevocable accept-or-reject choices. The goal is to design policies that achieve a good approximation ratio against the optimal offline solution, which can access all the values upfront (the so-called prophet value). In the prophet inequality over time problem (POT), the decision-maker can commit to an accepted value for $\tau$ units of time, during which no new values can be accepted. This creates a trade-off between the duration of commitment and the opportunity to capture potentially higher future values. In this work, we provide best-possible worst-case approximation ratios in the IID setting of POT for single-threshold algorithms and for the optimal dynamic programming policy. We show a single-threshold algorithm that achieves an approximation ratio of $(1+e^{-2})/2\approx 0.567$, and we prove that no single-threshold algorithm can surpass this guarantee. With our techniques, we can analyze simple algorithms using $k$ thresholds and show that with $k=3$ it is possible to get an approximation ratio of at least $\approx 0.602$. Then, for each $n$, we prove that the tight worst-case approximation ratio of the optimal dynamic programming policy for instances with $n$ values can be computed by solving a convex optimization program. A limit analysis of the first-order optimality conditions yields a nonlinear differential equation showing that the optimal dynamic programming policy's asymptotic worst-case approximation ratio is $\approx 0.618$. Finally, we extend the discussion to adversarial settings and show an optimal worst-case approximation ratio of $\approx 0.162$ when the values are streamed in random order.
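As a quick arithmetic check of the constants quoted above, the following minimal Python sketch evaluates the closed-form single-threshold ratio $(1+e^{-2})/2$ and compares it with the other reported values. The function name `single_threshold_ratio` and the dictionary `reported` are illustrative names introduced here, not part of the paper; the values $0.602$ and $0.618$ are copied from the abstract as rounded numbers, not recomputed.

```python
import math


def single_threshold_ratio() -> float:
    """Closed-form single-threshold guarantee (1 + e^{-2}) / 2 from the abstract."""
    return (1.0 + math.exp(-2.0)) / 2.0


# Rounded values as reported in the abstract (assumed here only for comparison).
reported = {
    "k=3 thresholds": 0.602,
    "optimal DP policy (asymptotic)": 0.618,
}

if __name__ == "__main__":
    base = single_threshold_ratio()
    print(f"single threshold: {base:.6f}")
    for name, value in reported.items():
        # Gap between each reported ratio and the single-threshold guarantee.
        print(f"{name}: {value:.3f} (gap over single threshold: {value - base:.3f})")
```

The closed form evaluates to roughly $0.5677$, so under these reported numbers the gap between the best single-threshold policy and the asymptotic guarantee of the optimal dynamic programming policy is about $0.05$.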
