Multiunit I.I.D. Prophet Inequalities via Extreme Value Asymptotics

Jieming Kong; Karthyek Murthy

Abstract

We study the i.i.d. $k$-selection prophet inequality problem, where a decision-maker sequentially observes $n$ independent nonnegative rewards and may accept at most $k$ of them without knowledge of future realizations. The objective is to maximize the expected total reward relative to that of a prophet who observes all rewards in advance. This problem captures the performance limits achievable in online resource allocation and underlies posted-price mechanisms in online marketplaces. We characterize the optimal welfare achievable relative to the prophet in terms of $k$ and the extreme value index of the reward distribution, in the asymptotic regime where the number of offers $n$ grows large. This optimal performance ratio turns out to be at least $1-\frac{\log k}{8k}[1+\varepsilon]$ for any $\varepsilon > 0$ and sufficiently large $k$, improving upon the corresponding tight $1 - \frac{1}{\sqrt{2\pi k}}$ guarantee of static-threshold algorithms. We additionally analyze the certainty-equivalent (CE) heuristic, a widely used online allocation algorithm known to yield optimal regret growth in $n$ when evaluated under the fluid scaling assumption. Even in the absence of the fluid scaling, the CE heuristic's performance improves with $k$ and eventually matches the leading-order terms of the optimal dynamic program's performance ratio. A finer analysis nevertheless reveals that the regret of the CE heuristic can be divergent and large relative to the optimal dynamic program when $n/k \to \infty$. This highlights how sensitive assessments of the CE heuristic's performance are to the commonly adopted, though subjective, fluid scaling assumption.
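To give a sense of the gap between the two guarantees quoted in the abstract, the following minimal sketch evaluates $1 - \frac{\log k}{8k}[1+\varepsilon]$ and $1 - \frac{1}{\sqrt{2\pi k}}$ for a few values of $k$. The function names, the choice $\varepsilon = 0.1$, and the sample values of $k$ are illustrative assumptions; the abstract only states that the first bound holds for sufficiently large $k$, so the small-$k$ rows should be read as indicative rather than guaranteed.

```python
import math


def static_threshold_guarantee(k: int) -> float:
    """Tight guarantee of static-threshold algorithms quoted in the abstract."""
    return 1.0 - 1.0 / math.sqrt(2.0 * math.pi * k)


def dynamic_lower_bound(k: int, eps: float = 0.1) -> float:
    """Lower bound 1 - (log k / (8k)) * (1 + eps) on the optimal ratio.

    Stated in the abstract for any eps > 0 and sufficiently large k;
    eps = 0.1 is an arbitrary illustrative choice.
    """
    return 1.0 - (math.log(k) / (8.0 * k)) * (1.0 + eps)


def compare(ks=(10, 100, 1000, 10000), eps: float = 0.1):
    """Return (k, static guarantee, dynamic lower bound) for each k."""
    return [
        (k, static_threshold_guarantee(k), dynamic_lower_bound(k, eps))
        for k in ks
    ]


if __name__ == "__main__":
    for k, static, dynamic in compare():
        print(f"k={k:6d}  static={static:.5f}  dynamic>={dynamic:.5f}")
```

The loss term of the static guarantee decays like $k^{-1/2}$, whereas the loss term of the dynamic bound decays like $\log k / k$, which is why the difference widens as $k$ grows.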

Paper Structure (47 sections, 31 theorems, 305 equations, 3 figures, 1 table)

Introduction
Known worst-case approximation guarantees in the i.i.d. setting
Results on Instance-Dependent Asymptotic Competitive Ratio
A sharp characterization of the asymptotic competitive ratio
Worst-case asymptotic competitive ratio for large $k$
A discussion on the large $n$ assumption and instance-dependent characterizations
Results on comparison of optimal online performance with the CE heuristic
Fixed threshold policies and the CE heuristic
Results on the performance of CE heuristic
Preliminaries from Extreme Value Theory
Optimal Asymptotic Competitive Ratio
A sharp characterization of the asymptotic competitive ratio
An understanding of ${\tt ACR}_k$ for large $k$
Comparisons with the CE Heuristic
Asymptotic competitive ratio of the CE heuristic
...and 32 more sections

Key Result

Theorem 3.1

Let $F$ be a distribution over $\mathbb{R}^+$ that satisfies the extreme value condition. Then the optimal asymptotic competitive ratio attainable by the dynamic program solution is given by where $\gamma$ is the extreme value index of the distribution $F$ and the sequence $\{v_k: k \geq 1\}$ is obtained recursively from $v_1 = 1$, and for any $k > 1$, $v_k - v_{k-1}$ is the unique positive value
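As a finite-$n$ illustration of the dynamic program referred to in the theorem, the sketch below computes the optimal online value by backward induction and compares it with the prophet's value. It assumes rewards drawn from the unit-rate exponential distribution (extreme value index $\gamma = 0$), a choice made here only because both quantities then have closed forms. It does not implement the asymptotic formula of Theorem 3.1, and the function names are hypothetical.

```python
import math


def optimal_online_value(n: int, k: int) -> float:
    """Optimal expected reward with n i.i.d. Exp(1) offers and k units.

    Backward induction on the number of remaining offers:
        V(t, j) = V(t-1, j) + E[(X - d)^+],  d = V(t-1, j) - V(t-1, j-1),
    and for X ~ Exp(1) with d >= 0, E[(X - d)^+] = exp(-d).
    """
    value = [0.0] * (k + 1)  # value[j]: j units left, no offers remaining
    for _ in range(n):
        new_value = value[:]
        for j in range(1, k + 1):
            marginal = value[j] - value[j - 1]  # value of holding the j-th unit
            new_value[j] = value[j] + math.exp(-marginal)
        value = new_value
    return value[k]


def prophet_value(n: int, k: int) -> float:
    """Expected sum of the k largest of n i.i.d. Exp(1) rewards.

    The i-th largest order statistic has mean sum_{j=i}^{n} 1/j.
    """
    k = min(k, n)
    return sum(
        sum(1.0 / j for j in range(i, n + 1)) for i in range(1, k + 1)
    )


def competitive_ratio(n: int, k: int) -> float:
    """Ratio of the optimal online value to the prophet's value."""
    return optimal_online_value(n, k) / prophet_value(n, k)


if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        print(f"k={k:2d}  n=2000  ratio={competitive_ratio(2000, k):.4f}")
```

Storing only the current row of values keeps memory at $O(k)$, and the overall cost of the backward induction is $O(nk)$.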

Figures (3)

Figure 1: Heatmaps of the asymptotic ratios as a function of $k$ and $\gamma$.
Figure 2: Finer Comparison between DP and CE of Pareto Distribution
Figure 3: Finer Comparison between DP and CE

Theorems & Definitions (64)

Definition 2.1: Extreme Value Condition
Theorem 3.1
Proposition 3.2
Corollary 3.3
Theorem 4.1
Proposition 4.2
Corollary 4.3
Theorem 4.4
Proposition 4.5
Lemma 6.1
...and 54 more
