Prophet Inequalities for Bandits, Cabinets, and DAGs

Robin Bowers, Elias Lindgren, Bo Waggoner

TL;DR

This work addresses online selection among multiple costly Markov search processes (MSPs) on a finite acyclic graph, seeking a feasible subset that maximizes welfare. It develops a robust local-to-global approach using SAUP (single-agent utility problem) and threshold-based policies, leveraged through reductions to Pandora's Cabinets and bandit indices to handle highly interactive MSPs without relying on index theorems. The main contribution is a computationally efficient $\frac{1}{2}-\epsilon$ prophet inequality for Combinatorial Markov Search under any matroid constraint, together with a polynomial-time method to approximate the ex-ante optimum via convex optimization and an FPTAS. The results unify Bandits, Cabinets, and DAGs into a coherent framework, enabling incentive-compatible mechanisms with constant Price of Anarchy in settings where agents perform costly, strategic search, and extend classical prophet inequalities to broad interactive decision problems.

Abstract

A decisionmaker faces $n$ alternatives, each of which represents a potential reward. After investing costly resources into investigating the alternatives, the decisionmaker may select one, or more generally a feasible subset, and obtain the associated reward(s). The objective is to maximize the sum of rewards minus total costs invested. We consider this problem under a general model of an alternative as a "Markov Search Process," a type of undiscounted Markov Decision Process on a finite acyclic graph. Even simple cases generalize NP-hard problems such as Pandora's Box with nonobligatory inspection. Despite the apparently adaptive and interactive nature of the problem, we prove optimal prophet inequalities for this problem under a variety of combinatorial constraints. That is, we give approximation algorithms that interact with the alternatives sequentially, where each must be fully explored and either selected or else discarded before the next arrives. In particular, we obtain a computationally efficient $\frac{1}{2}-\epsilon$ prophet inequality for Combinatorial Markov Search subject to any matroid constraint. This result implies incentive-compatible mechanisms with constant Price of Anarchy for serving single-parameter agents when the agents strategically conduct independent, costly search processes to discover their values.
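As background for the "rewards minus costs" objective, the following is a minimal sketch of the classical Pandora's Box problem with obligatory inspection, which is simpler than the nonobligatory variant named in the abstract and is not the paper's algorithm. In that classical setting, a box with random value $V$ and inspection cost $c$ has a reservation value $\sigma$ solving $\mathbb{E}[(V-\sigma)^+] = c$. The function names and the toy distribution below are assumptions made for illustration.

```python
def expected_excess(values, probs, sigma):
    """E[(V - sigma)^+] for a discrete distribution (illustrative helper)."""
    return sum(p * max(v - sigma, 0.0) for v, p in zip(values, probs))


def reservation_value(values, probs, cost, tol=1e-9):
    """Solve E[(V - sigma)^+] = cost for sigma by bisection.

    Assumes cost > 0. The expected excess is continuous and non-increasing
    in sigma, equals 0 at max(values), and is at least `cost` at
    min(values) - cost, so a root lies in that interval.
    """
    if cost <= 0:
        raise ValueError("cost must be positive")
    lo, hi = min(values) - cost, max(values)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_excess(values, probs, mid) > cost:
            lo = mid  # excess too large: the root is to the right
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical toy box: value 0 or 10 with equal probability, cost 1.
sigma = reservation_value([0.0, 10.0], [0.5, 0.5], cost=1.0)
```

For this toy box the equation reads $0.5\,(10-\sigma) = 1$, so the bisection converges to $\sigma = 8$.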

Paper Structure

This paper contains 44 sections, 30 theorems, 46 equations, 2 figures.

Key Result

Theorem 1

For the problem of Combinatorial Markov Search subject to any matroid feasibility constraint, there exists an online algorithm running in time polynomial in the input size and $\frac{1}{\epsilon}$ that, given any $\epsilon > 0$, guarantees a prophet inequality of $\frac{1}{2} - \epsilon$.
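To make the notion of a prophet inequality concrete, here is a minimal sketch of the classical single-item case with non-negative values and no search costs. It is not the algorithm of Theorem 1. It uses a well-known fixed-threshold rule: accept the first value that is at least half the expected maximum, which guarantees at least half the prophet's expected value. The instance and function names are assumptions chosen for illustration, and expectations are computed exactly by enumeration rather than by sampling.

```python
from itertools import product


def enumerate_outcomes(dists):
    """Yield (values, probability) for every joint outcome of independent items.

    Each item is a list of (value, probability) pairs.
    """
    for combo in product(*dists):
        prob = 1.0
        for _, p in combo:
            prob *= p
        yield [v for v, _ in combo], prob


def prophet_value(dists):
    """Expected maximum: what a prophet who sees all values would obtain."""
    return sum(prob * max(values) for values, prob in enumerate_outcomes(dists))


def threshold_value(dists, threshold):
    """Expected value of accepting the first arrival with value >= threshold."""
    total = 0.0
    for values, prob in enumerate_outcomes(dists):
        accepted = next((v for v in values if v >= threshold), 0.0)
        total += prob * accepted
    return total


# Hypothetical instance: item 1 is always worth 1; item 2 is worth 4 w.p. 0.25.
dists = [[(1.0, 1.0)], [(4.0, 0.25), (0.0, 0.75)]]
prophet = prophet_value(dists)             # 0.25 * 4 + 0.75 * 1 = 1.75
alg = threshold_value(dists, prophet / 2)  # threshold 0.875: always takes item 1
```

On this instance the online rule earns 1 against a prophet value of 1.75, a ratio of about 0.57, consistent with the factor-$\frac{1}{2}$ guarantee.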

Figures (2)

  • Figure 1: Bandits, Cabinets, and DAGs. Simplified view of several decisionmaking structures considered in this paper. Time moves from left to right. Edges from each state node show possible actions, where the out degree is the number of available actions at a given state. Each decision incurs a cost and results in a stochastic state transition (omitted from the figure to focus on the differences in settings). (\ref{subfig:multistage-pandora}) represents a bandit process where there is only one available action from every state, i.e. to advance the process. (\ref{subfig:cabinets}) represents Pandora's Cabinets model in which there is an initial decision between several alternatives (i.e. which "drawer" to open), and each alternative is a bandit process. (\ref{subfig:pandora-full-tree}) represents the most general Markov Search Process model, where the state transitions may form an arbitrary DAG. In each of the settings in this paper, $n$ structures arrive sequentially, and each must be explored and either selected or discarded before the next comes.
  • Figure 2: Reductions. Given any downward-closed constraint, an ex-ante prophet inequality for one setting implies one for the following setting. The approximation guarantee is preserved. However, computational efficiency and incentive compatibility are not necessarily preserved. In the case of matroid constraints, both efficiency and incentive compatibility can be preserved with a loss of an arbitrarily small $\epsilon$ in the approximation factor (Corollary \ref{cor:dags-matroid-approx}).
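The structures in Figure 1 can be pictured with a small data model. The sketch below is an illustration under simplifying assumptions, not the paper's formal definition: each non-sink state offers actions with a cost and a distribution over next states, sink states carry a reward that may be collected for free, and the decisionmaker may walk away at any point for zero. Under those assumptions, backward induction over the acyclic graph gives the optimal expected utility of a single process considered in isolation. All names (`Action`, `optimal_value`, the toy states) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    cost: float
    transitions: tuple  # tuple of (probability, next_state) pairs


def optimal_value(start, actions, sink_value):
    """Optimal expected utility of one process, by memoized backward induction.

    `actions` maps a state to its available actions; `sink_value` maps sink
    states to rewards. Termination relies on the state graph being acyclic.
    """
    memo = {}

    def value(state):
        if state in memo:
            return memo[state]
        if state in sink_value:
            best = max(0.0, sink_value[state])  # select the reward or discard
        else:
            best = 0.0  # stop exploring and discard
            for action in actions.get(state, ()):
                cont = -action.cost + sum(
                    p * value(nxt) for p, nxt in action.transitions
                )
                best = max(best, cont)
        memo[state] = best
        return best

    return value(start)


# Toy bandit-style process (one action per state): pay 1 to open a box
# worth 0 or 10 with equal probability.
ACTIONS = {"closed": (Action(1.0, ((0.5, "low"), (0.5, "high"))),)}
SINK_VALUE = {"low": 0.0, "high": 10.0}
box_value = optimal_value("closed", ACTIONS, SINK_VALUE)
```

Here the single action yields $-1 + 0.5 \cdot 0 + 0.5 \cdot 10 = 4$, which beats walking away, so `box_value` is 4. A state with several actions, as in the Cabinets and general DAG panels, is handled by the same maximization over actions.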

Theorems & Definitions (70)

  • Theorem : Main result, Corollary \ref{cor:dags-matroid-approx}
  • Proposition : Proposition \ref{prop:max-saup}
  • Theorem : Theorem \ref{thm:classic-cp-dc}
  • Theorem : Theorem \ref{thm:classic-cp-saup}
  • Theorem : Theorem \ref{thm:cp-pc}
  • Definition 2.1: $\mathcal{P}_{\mathcal{F}}$, ex-ante feasible
  • Definition 2.2: Bandit
  • Definition 2.3: Non-exposed
  • Lemma 2.1: kleinberg2016descending, bowers2024matching
  • Definition : SAUP, informal sketch
  • ...and 60 more