Prophet Inequalities: Separating Random Order from Order Selection

Giordano Giambartolomei, Frederik Mallmann-Trenn, Raimundo Saona

Abstract

Prophet inequalities are a central object of study in optimal stopping theory. A gambler is sent values in an online fashion, sampled from an instance of independent distributions, in an adversarial, random or selected order, depending on the model. When observing each value, the gambler either accepts it as a reward or irrevocably rejects it and proceeds to observe the next value. The goal of the gambler, who cannot see the future, is to maximise the expected value of the reward while competing against the expectation of a prophet (the offline maximum). In other words, one seeks to maximise the gambler-to-prophet ratio of the expectations. The model in which the gambler first selects the arrival order and then observes the values is known as Order Selection. In this model, a ratio of $0.7251$ is attainable for any instance; this was recently improved to $0.7258$ by Bubna and Chiplunkar (2023). If the gambler chooses the arrival order (uniformly) at random, we obtain the Random Order model. The worst-case ratio over all possible instances has been extensively studied for at least $40$ years. In a computer-assisted proof, Bubna and Chiplunkar (2023) also showed that this ratio is at most $0.7254$ for the Random Order model, thus establishing for the first time that carefully choosing the order, instead of simply taking it at random, benefits the gambler. We give an alternative, non-simulation-assisted proof of this fact, by showing mathematically that in the Random Order model, no algorithm can achieve a ratio larger than $0.7235$. This sets a new state-of-the-art hardness bound for this model, and establishes more formally that there is a real benefit in choosing the order.
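
To make the three quantities in the abstract concrete, the following is a minimal Python sketch that computes the prophet's expectation, the best Order Selection value, and the optimal Random Order value exactly for a tiny discrete instance. It is an illustration only and not the paper's method or hard instance: the function names (`prophet`, `fixed_order`, `order_selection`, `random_order`), the `(value, probability)` input format, and the two-distribution toy instance are all assumptions introduced here. It also assumes non-negative values and that the gambler knows which distribution each arriving value was drawn from.

```python
from functools import lru_cache
from itertools import permutations, product

# Hypothetical input format: each distribution is a list of
# (value, probability) pairs with non-negative values.

def expected_max_with(dist, continuation):
    """E[max(X, continuation)]: accept X only if it beats continuing."""
    return sum(p * max(v, continuation) for v, p in dist)

def prophet(dists):
    """Expected offline maximum, by enumerating all joint outcomes."""
    total = 0.0
    for outcome in product(*dists):
        prob = 1.0
        for _, p in outcome:
            prob *= p
        total += prob * max(v for v, _ in outcome)
    return total

def fixed_order(dists, order):
    """Optimal gambler value for a given arrival order (backward induction)."""
    value = 0.0
    for i in reversed(order):
        value = expected_max_with(dists[i], value)
    return value

def order_selection(dists):
    """Best value when the gambler fixes the arrival order in advance."""
    return max(fixed_order(dists, o) for o in permutations(range(len(dists))))

def random_order(dists):
    """Optimal value under uniformly random arrival (DP over unseen subsets)."""
    @lru_cache(maxsize=None)
    def value(remaining):
        if not remaining:
            return 0.0
        # The next arrival is uniform over the distributions not yet seen.
        return sum(
            expected_max_with(dists[i], value(remaining - {i}))
            for i in remaining
        ) / len(remaining)
    return value(frozenset(range(len(dists))))

# Toy instance (an assumption, chosen for easy hand-checking):
# X1 = 1 always; X2 = 2 with probability 1/2, else 0.
toy = [[(1.0, 1.0)], [(2.0, 0.5), (0.0, 0.5)]]
print(order_selection(toy) / prophet(toy))  # Order Selection ratio
print(random_order(toy) / prophet(toy))     # Random Order ratio
```

Both the permutation search and the subset recursion are exponential in the number of distributions, so this only scales to very small instances; it is meant to show how the two models can differ on the same input, not to reproduce the worst-case bounds discussed above.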

Paper Structure (37 sections, 6 theorems, 57 equations, 1 figure)

This paper contains 37 sections, 6 theorems, 57 equations, 1 figure.

Key Result

Theorem 1.1

On Instance \ref{instance}, the optimal stopping rule $T\in C^{n+1}$ is such that [...]. As a consequence, RO is $0.7235$-hard. Therefore, OS is separated from RO, and the former beats the latter.

Figures (1)

  • Figure 1: Simulation of the dynamic program (the shared code is referenced in \ref{code}) for Instance \ref{instance} with $a=0.789$, $b=1.24$, $p=0.421$, $n=10^6$. \ref{fig: prophet full} shows the sequences $\{\phi_k\}$ (blue), $\{\bar{\phi}_k\}$ (amber), and the values $a$ (green), $b$ (red). \ref{fig: prophet zoom} shows a zoom on the intersection where $\phi_k \approx b$. Informally, the abscissa of the intercept of the blue dotted curve with the red line corresponds to the smallest acceptance time $k_{10^6}\approx 2253$; the abscissa of the intercept of the amber dotted curve with the red line is the second largest acceptance time $\bar{k}_{10^6}\approx 211231$; the abscissa of the intercept of the blue dotted curve with the green line is the largest acceptance time $j_{10^6}\approx 415187$, as per \ref{accept} (c, d, e).

Theorems & Definitions (22)

  • Definition 1.1
  • Theorem 1.1
  • Definition 2.1
  • Theorem 2.1
  • Remark 2.1
  • Remark 2.2
  • Remark 2.3
  • Remark 2.4
  • Remark 2.5
  • Definition 2.2
  • ...and 12 more