Bandit Pareto Set Identification: the Fixed Budget Setting

Cyrille Kone; Emilie Kaufmann; Laura Richert

TL;DR

The paper addresses fixed-budget Pareto Set Identification (PSI) in multi-objective bandits, formalizing the Pareto set $\mathcal{S}^\star$ and the error probability $e_T(\nu)$ under a budget $T$. It introduces Empirical Gap Elimination (EGE), a round-based, gap-driven elimination framework, and instantiates it as EGE-SR and EGE-SH, whose misidentification probability decays exponentially with the budget. The authors establish an instance-dependent lower bound, relate the exponential rates to the complexity term $H_2(\nu)$, and show that these algorithms are near-optimal in the worst case; they also compare against a tuned fixed-confidence approach and report experiments on real-world and synthetic data. Additionally, the paper extends the approach to the PSI-$k$ relaxation via the algorithm AMP and analyzes its fixed-budget behavior, including stopping-time guarantees and sample complexity trade-offs. Overall, the work provides the first fixed-budget PSI algorithms, supported by theoretical guarantees and empirical validation.

Abstract

We study a multi-objective pure exploration problem in a multi-armed bandit model. Each arm is associated with an unknown multivariate distribution, and the goal is to identify the distributions whose mean is not uniformly worse than that of another distribution: the Pareto optimal set. We propose and analyze the first algorithms for the fixed-budget Pareto Set Identification task. We propose Empirical Gap Elimination, a family of algorithms combining a careful estimation of the "hardness to classify" each arm in or out of the Pareto set with a generic elimination scheme. We prove that two particular instances, EGE-SR and EGE-SH, have a probability of error that decays exponentially fast with the budget, with an exponent supported by an information-theoretic lower bound. We complement these findings with an empirical study using real-world and synthetic datasets, which showcases the good performance of our algorithms.
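To make the target of the identification task concrete, the following minimal sketch computes the Pareto optimal set from a table of known mean vectors. It is an illustration only, not one of the paper's algorithms (which must estimate the means from samples under a budget). The function names `dominates` and `pareto_set`, the example means, and the convention that larger values are better with dominance requiring at least one strict inequality are all assumptions made here.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if mean vector a dominates mean vector b.

    Assumed convention: a is at least as good as b on every objective
    (larger is better) and strictly better on at least one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_set(means: Sequence[Sequence[float]]) -> List[int]:
    """Return the indices of arms whose mean is not dominated by any other arm."""
    optimal = []
    for i, mu_i in enumerate(means):
        is_dominated = any(
            dominates(mu_j, mu_i) for j, mu_j in enumerate(means) if j != i
        )
        if not is_dominated:
            optimal.append(i)
    return optimal


# Hypothetical example: four arms, two objectives.
example_means = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9], [0.4, 0.4]]
print(pareto_set(example_means))  # prints [0, 1, 2]
```

In this example the last arm is dominated by the second one, while the first three arms trade off the two objectives and are therefore all Pareto optimal.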

Paper Structure (55 sections, 35 theorems, 284 equations, 30 figures, 2 tables, 3 algorithms)

INTRODUCTION
Related work
SETTING
ALGORITHMS
Empirical Gap Elimination
Particular Instances
Alternative Approach
THEORETICAL GUARANTEES
Lower Bound
Sketch of proof of Theorem \ref{thm:main-res}
RELAXING PSI
EXPERIMENTAL STUDY
Real-world Datasets
COV-BOOST
Hardware Design
...and 40 more sections

Key Result

Lemma 1

For any arm $i \in [K]$, where $\delta_i^\star := \min_{j \neq i} \left[\mathrm{M}(i, j) \land \left(\mathrm{M}(j, i)^+ + (\Delta_j^\star)^+\right)\right]$.

Figures (30)

Figure 1: Application 1: COV-BOOST trial
Figure 2: Application 2: Sorting Networks dataset.
Figure 3: Arms on a convex Pareto set.
Figure 4: Each sub-optimal $i$ is only dominated by $i^\star$.
Figure 5: $K=200$ arms on the unit circle.
...and 25 more figures

Theorems & Definitions (56)

Definition 1
Lemma 1
Remark 1
Theorem 1
Corollary 1.1
Theorem 2
Lemma 2
Lemma 3
Lemma 4
Theorem 3
...and 46 more
