Learning in Prophet Inequalities with Noisy Observations

Jung-hun Kim; Vianney Perchet

Learning in Prophet Inequalities with Noisy Observations

Jung-hun Kim, Vianney Perchet

Abstract

We study the prophet inequality, a fundamental problem in online decision-making and optimal stopping, in a practical setting where rewards are observed only through noisy realizations and reward distributions are unknown. At each stage, the decision-maker receives a noisy reward whose true value follows a linear model with an unknown latent parameter, and observes a feature vector drawn from a distribution. To address this challenge, we propose algorithms that integrate learning and decision-making via lower-confidence-bound (LCB) thresholding. In the i.i.d.\ setting, we establish that both an Explore-then-Decide strategy and an $\varepsilon$-Greedy variant achieve the sharp competitive ratio of $1 - 1/e$, under a mild condition on the optimal value. For non-identical distributions, we show that a competitive ratio of $1/2$ can be guaranteed against a relaxed benchmark. Moreover, with limited window access to past rewards, the tight ratio of $1/2$ against the optimal benchmark is achieved.

Learning in Prophet Inequalities with Noisy Observations

Abstract

Learning in Prophet Inequalities with Noisy Observations

Abstract

Paper Structure

Table of Contents

Key Result

Figures (2)

Theorems & Definitions (28)