Bandit Pareto Set Identification: the Fixed Budget Setting
Cyrille Kone, Emilie Kaufmann, Laura Richert
TL;DR
The paper addresses fixed-budget Pareto Set Identification (PSI) in multi-objective bandits, formalizing the Pareto set $\mathcal{S}^\star$ and the error probability $e_T(\nu)$ under a budget $T$. It introduces Empirical Gap Elimination (EGE), a round-based, gap-driven elimination framework, and instantiates it as EGE-SR and EGE-SH, whose misidentification probability decays exponentially with the budget. The authors establish an instance-dependent lower bound, relate the exponential rates to the complexity term $H_2(\nu)$, and show that these algorithms are near-optimal in the worst case. They also compare against a tuned fixed-confidence approach and evaluate robustness in experiments on real-world and synthetic data. Additionally, the paper extends the approach to the PSI-$k$ relaxation via the algorithm AMP and analyzes its fixed-budget behavior, including stopping-time guarantees and sample complexity trade-offs. Overall, the work provides the first fixed-budget PSI algorithms, with theoretical guarantees and empirical validation, for multi-objective pure exploration in budget-constrained settings.
Abstract
We study a multi-objective pure exploration problem in a multi-armed bandit model. Each arm is associated with an unknown multivariate distribution, and the goal is to identify the distributions whose mean is not uniformly worse than that of another distribution: the Pareto optimal set. We propose and analyze the first algorithms for the \emph{fixed-budget} Pareto Set Identification task. We introduce Empirical Gap Elimination, a family of algorithms combining a careful estimation of the ``hardness to classify'' each arm in or out of the Pareto set with a generic elimination scheme. We prove that two particular instances, EGE-SR and EGE-SH, have a probability of error that decays exponentially fast with the budget, with an exponent supported by an information-theoretic lower bound. We complement these findings with an empirical study using real-world and synthetic datasets, which showcases the good performance of our algorithms.
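To make the notion of the Pareto optimal set concrete, the following minimal sketch computes it from known mean vectors. It is an illustration only, not the paper's algorithm: the function names `dominates` and `pareto_set` are hypothetical, the means are assumed to be known exactly (in the bandit problem they must be estimated from samples), and the sketch assumes that larger is better on every objective and that an arm is dominated when another arm is at least as good on all objectives and strictly better on at least one.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if mean vector `a` dominates `b` (assumed convention:
    at least as good on every objective, strictly better on at least one)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_set(means: Sequence[Sequence[float]]) -> List[int]:
    """Return the indices of arms whose mean vector is dominated by no other arm."""
    return [
        i
        for i, mu_i in enumerate(means)
        if not any(dominates(mu_j, mu_i) for j, mu_j in enumerate(means) if j != i)
    ]


# Hypothetical instance: five arms, two objectives.
example_means = [
    (1.0, 3.0),  # Pareto optimal
    (2.0, 2.0),  # Pareto optimal
    (3.0, 1.0),  # Pareto optimal
    (1.0, 1.0),  # dominated by (2.0, 2.0)
    (2.0, 1.5),  # dominated by (2.0, 2.0)
]
print(pareto_set(example_means))
```

In the fixed-budget setting the same computation would be applied to empirical means after the $T$ samples are spent, which is why arms whose means lie close to the dominance boundary are the hard ones to classify.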
