Markovian Pandora's box

Yuanyuan Yang, Ruimin Zhang, Jamie Morgenstern, Haifeng Xu

TL;DR

This work extends the Pandora's Box framework to include Markovian reward dependencies under precedence constraints encoded by a DAG. The authors develop a Generalized Reservation Value (GRV) framework that yields a fully adaptive (FA) optimal policy when the graph is forest-structured, with polynomial-time algorithms for single-line, multi-line, and forest cases via an equivalent reward table and contraction techniques. Under static transition, they obtain faster, subgraph-based strategies that are near-optimal for multi-lines and provide a 1/2-approximation for forests, with fixed-point iterations ensuring efficient computation. The results advance understanding of sequential exploration with Markovian correlations and offer practical algorithms for data-driven algorithm design where exploring future models incurs costs.

Abstract

In this paper, we study the Markovian Pandora's Box Problem, where decisions are governed by both order constraints and Markovianly correlated rewards, structured within a shared directed acyclic graph. To the best of our knowledge, previous work has not incorporated Markovian dependencies in this setting. This framework is particularly relevant to applications such as data- or computation-driven algorithm design, where exploring future models incurs a cost. We present optimal fully adaptive strategies for the case where the associated graph forms a forest. Under static transition, we introduce a strategy that achieves a near-optimal expected payoff in multi-line graphs and a 1/2-approximation in forest-structured graphs. Notably, this algorithm provides a significant speedup over the exact solution, with the improvement becoming more pronounced as the graph size increases. Our findings deepen the understanding of sequential exploration under Markovian correlations in graph-based decision-making.
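
For context only, the sketch below computes the reservation value from the classical Pandora's Box problem with independent boxes, i.e. the value sigma solving E[max(X - sigma, 0)] = c for a box with reward X and opening cost c. This is the standard independent-box index, not the Generalized Reservation Value developed in this paper, and it ignores precedence constraints and Markovian correlation. The function name `reservation_value` and its discrete-distribution inputs are hypothetical choices made for illustration.

```python
def reservation_value(values, probs, cost, tol=1e-9):
    """Classical (independent-box) reservation value for a discrete reward.

    Solves E[max(X - sigma, 0)] = cost for sigma by bisection.
    Assumes cost > 0 and that probs sums to 1 (illustrative assumptions).
    """
    def expected_excess(sigma):
        # Expected gain from opening the box when sigma is already in hand.
        return sum(p * max(v - sigma, 0.0) for v, p in zip(values, probs))

    # expected_excess is continuous and non-increasing in sigma.
    lo = min(values) - cost  # here every v - sigma >= cost, so excess >= cost
    hi = max(values)         # here the excess is 0 <= cost
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_excess(mid) > cost:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # A box worth 0 or 10 with equal probability and opening cost 1.
    print(reservation_value([0.0, 10.0], [0.5, 0.5], 1.0))
```

In the classical setting, boxes are opened in decreasing order of this index and the search stops once the best observed reward exceeds every remaining index; the paper's contribution concerns what replaces this rule when rewards are correlated along a graph.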

Paper Structure

This paper contains 37 sections, 39 theorems, 53 equations, 1 figure, 2 tables, 3 algorithms.

Key Result

Theorem 1.1

For a Markovian Pandora’s Box problem with a forest-structured precedence graph, there exists a fully adaptive algorithm that achieves the optimal expected payoff in polynomial time and space.

Figures (1)

  • Figure 1: Reduction From Tree to Multi-Line Setting

Theorems & Definitions (77)

  • Theorem 1.1: Optimal Solution for Forest-Structured Graphs (Lem. \ref{lem:app:update_GRV_forest} and Thm. \ref{thm:app:GRV_opt_forest})
  • Theorem 1.2: Faster Solution Under Static Transition
  • Definition 2.2: Adaptivity in Strategy Design: NA, PA, FA
  • Lemma 2.3: The sub-optimality of PA strategies
  • Theorem 2.4: Lower bound for Pandora's problem
  • Definition 3.1: Hyperbox
  • Definition 3.2: Equivalent Reward
  • Definition 3.3: Generalized Reservation Value
  • Lemma 3.4: Properties of GRV
  • Theorem 3.5: Optimality of GRV
  • ...and 67 more