Bandit Pareto Set Identification: the Fixed Budget Setting

Cyrille Kone, Emilie Kaufmann, Laura Richert

TL;DR

The paper addresses fixed-budget Pareto Set Identification (PSI) in multi-objective bandits, formalizing the Pareto set $\mathcal{S}^\star$ and the error probability $e_T(\nu)$ under a budget $T$. It introduces Empirical Gap Elimination (EGE), a round-based, gap-driven elimination framework, and instantiates it as EGE-SR and EGE-SH, both of which achieve a misidentification probability that decays exponentially with the budget. The authors establish an instance-dependent lower bound, relate the exponential rates to the complexity term $H_2(\nu)$, and show that these algorithms are near-optimal in the worst case. They also compare against a tuned fixed-confidence approach and evaluate the algorithms on real-world and synthetic data. Additionally, the paper extends the approach to the PSI-$k$ relaxation via the algorithm AMP and analyzes its fixed-budget behavior, including stopping-time guarantees and sample complexity trade-offs. Overall, the work provides the first fixed-budget PSI algorithms, with theoretical guarantees and empirical validation, for multi-objective pure exploration in budget-constrained settings.

Abstract

We study a multi-objective pure exploration problem in a multi-armed bandit model. Each arm is associated with an unknown multivariate distribution, and the goal is to identify the distributions whose mean is not uniformly worse than that of another distribution: the Pareto optimal set. We propose and analyze the first algorithms for the fixed budget Pareto Set Identification task. We propose Empirical Gap Elimination, a family of algorithms combining a careful estimation of the "hardness to classify" each arm in or out of the Pareto set with a generic elimination scheme. We prove that two particular instances, EGE-SR and EGE-SH, have a probability of error that decays exponentially fast with the budget, with an exponent supported by an information-theoretic lower bound. We complement these findings with an empirical study using real-world and synthetic datasets, which showcases the good performance of our algorithms.
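
To make the notion of the Pareto optimal set concrete, here is a minimal sketch that computes it from known mean vectors. It is only an illustration of the definition, not one of the paper's algorithms, which must work from noisy samples under a budget. The function name `pareto_set` and the example means are hypothetical, and the sketch assumes that larger is better in every objective and that an arm is dominated when another arm is at least as good in every objective and strictly better in at least one; the paper's exact dominance convention may differ.

```python
from typing import List, Sequence


def pareto_set(means: Sequence[Sequence[float]]) -> List[int]:
    """Return the indices of arms whose mean vector is not dominated.

    Assumptions (not taken from the paper): larger is better in every
    objective, and arm j dominates arm i when j is at least as good as i
    in every objective and strictly better in at least one.
    """
    optimal = []
    for i, mu_i in enumerate(means):
        dominated = False
        for j, mu_j in enumerate(means):
            if j == i:
                continue
            at_least_as_good = all(b >= a for a, b in zip(mu_i, mu_j))
            strictly_better = any(b > a for a, b in zip(mu_i, mu_j))
            if at_least_as_good and strictly_better:
                dominated = True
                break
        if not dominated:
            optimal.append(i)
    return optimal


# Hypothetical two-objective instance with five arms.
example_means = [(1.0, 0.2), (0.8, 0.8), (0.2, 1.0), (0.5, 0.5), (0.1, 0.1)]
example_pareto = pareto_set(example_means)
```

In this hypothetical instance, arms 3 and 4 are both worse than arm 1 in every objective, so they fall outside the Pareto set, while arms 0, 1, and 2 each excel in a way no other arm matches.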

Paper Structure

This paper contains 55 sections, 35 theorems, 284 equations, 30 figures, 2 tables, and 3 algorithms.

Key Result

Lemma 1

For any arm $i \in [K]$, where $\delta_i^\star := \min_{j \neq i} \left[ \operatorname{M}(i, j) \land \left( \operatorname{M}(j, i)^+ + (\Delta_j^\star)^+ \right) \right]$.

Figures (30)

  • Figure 1: Application 1: COV-BOOST trial.
  • Figure 2: Application 2: Sorting Networks dataset.
  • Figure 3: Arms on a convex Pareto set.
  • Figure 4: Each sub-optimal $i$ is only dominated by $i^\star$.
  • Figure 5: $K=200$ arms on the unit circle.
  • ...and 25 more figures

Theorems & Definitions (56)

  • Definition 1
  • Lemma 1
  • Remark 1
  • Theorem 1
  • Corollary 1.1
  • Theorem 2
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Theorem 3
  • ...and 46 more