Optimal Multi-Fidelity Best-Arm Identification

Riccardo Poiani, Rémy Degenne, Emilie Kaufmann, Alberto Maria Metelli, Marcello Restelli

TL;DR

This work addresses fixed-confidence best-arm identification in a multi-fidelity bandit setting, where each arm can be observed at several fidelities that differ in cost and accuracy. It introduces a tight instance-dependent lower bound on cost complexity based on a KL-based information-theoretic functional $F(\boldsymbol{\omega}, \boldsymbol{\mu})$ and an associated optimal cost complexity $C^{*}(\boldsymbol{\mu})$, then proposes MF-GRAD, a gradient-based algorithm whose upper bound matches it asymptotically. MF-GRAD runs sub-gradient ascent on cost proportions, converts them to pull proportions, and uses a GLR stopping rule to guarantee $\delta$-correct identification of the best arm at the highest fidelity, with per-iteration complexity $O(K^{2}M^{2})$. Theoretical guarantees show that MF-GRAD is asymptotically optimal in the high-confidence regime, and experiments show lower empirical cost than prior MF-BAI approaches, along with insights into the sparsity of optimal fidelities in small multi-fidelity settings. A key finding is a sparsity pattern: in many $2\times M$ problems, each arm's optimal cost allocation concentrates on a single fidelity, which informs the design of multi-fidelity sampling strategies.
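The conversion from cost proportions to pull proportions mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `cost_to_pull_proportions`, the per-fidelity cost vector `costs`, and the assumption that an observation at a given fidelity costs the same for every arm are all introduced here for illustration. The idea is simply that a share $\omega_{a,m}$ of the budget buys a number of pulls proportional to $\omega_{a,m}$ divided by the cost of fidelity $m$.

```python
import numpy as np


def cost_to_pull_proportions(omega, costs):
    """Turn cost proportions into pull proportions (illustrative sketch).

    omega: array of shape (K, M), non-negative, summing to 1; entry (a, m)
           is the share of the total cost spent on arm a at fidelity m.
    costs: array of shape (M,), strictly positive; cost of one observation
           at each fidelity (assumed identical across arms).
    Returns an array of shape (K, M) summing to 1; entry (a, m) is the
    share of pulls allocated to arm a at fidelity m.
    """
    omega = np.asarray(omega, dtype=float)
    costs = np.asarray(costs, dtype=float)
    if omega.ndim != 2 or costs.shape != (omega.shape[1],):
        raise ValueError("omega must be (K, M) and costs must be (M,)")
    if np.any(costs <= 0):
        raise ValueError("costs must be strictly positive")
    if np.any(omega < 0):
        raise ValueError("omega must be non-negative")

    # Pulls bought per unit of budget: divide each column by its cost.
    weights = omega / costs
    total = weights.sum()
    if total <= 0:
        raise ValueError("omega must have positive total mass")
    # Normalise so that the pull proportions sum to one.
    return weights / total
```

For instance, with costs `[1.0, 4.0]` and half of the budget on each of two (arm, fidelity) pairs, the pair at the cheaper fidelity receives 80% of the pulls and the pair at the costlier one receives 20%.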

Abstract

In bandit best-arm identification, an algorithm is tasked with finding the arm with the highest mean reward, to a specified accuracy, as fast as possible. We study multi-fidelity best-arm identification, in which the algorithm can choose to sample an arm at a lower fidelity (less accurate mean estimate) for a lower cost. Several methods have been proposed for tackling this problem, but their optimality remains elusive, notably due to loose lower bounds on the total cost needed to identify the best arm. Our first contribution is a tight, instance-dependent lower bound on the cost complexity. The study of the optimization problem featured in the lower bound provides new insights for devising computationally efficient algorithms, and leads us to propose a gradient-based approach with asymptotically optimal cost complexity. We demonstrate the benefits of the new algorithm compared to existing methods in experiments. Our theoretical and empirical findings also shed light on an intriguing concept of optimal fidelity for each arm.

Paper Structure (46 sections, 35 theorems, 107 equations, 15 figures, 5 tables, 1 algorithm)


Key Result

Theorem 3.1

Let $\delta \in (0, 1)$. For any $\delta$-correct strategy, and any multi-fidelity bandit model $\bm{\mu} \in \mathcal{M}_{\textup{MF}}^*$, it holds that: where $C^*(\bm{\mu})^{-1} \coloneqq \sup_{\bm{\omega} \in \Delta_{K \times M}} F(\bm{\omega}, \bm{\mu}) = \sup_{\bm{\omega} \in \Delta_{K \times M}} \min_{a \ne \star} f_{\star, a} (\bm{\omega}, \bm{\mu})$ .
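For orientation, fixed-confidence lower bounds derived by a change-of-measure argument usually take the shape sketched below. This is an assumption about the general form, not a quotation of the theorem: the symbol $c_{\tau_\delta}$ (cumulative cost paid up to the stopping time $\tau_\delta$) and the $\delta$-dependent factor $\mathrm{kl}(\delta, 1-\delta)$ are hypothetical notation introduced here.

```latex
% Assumed generic shape of a cost-complexity lower bound (not the paper's exact statement)
\mathbb{E}_{\bm{\mu}}\left[ c_{\tau_\delta} \right]
  \;\ge\; C^*(\bm{\mu}) \, \mathrm{kl}(\delta, 1-\delta),
\qquad
\mathrm{kl}(\delta, 1-\delta) \sim \log(1/\delta) \quad \text{as } \delta \to 0 .
```

Read this way, $C^*(\bm{\mu})$ plays the role of the cost paid per unit of $\log(1/\delta)$, and the maximizer $\bm{\omega}$ in its definition describes how an ideal strategy would split its budget across arm and fidelity pairs.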

Figures (15)

  • Figure 1: Empirical cost complexity over $1000$ runs with $\delta=0.01$ on the $4 \times 5$ multi-fidelity bandit.
  • Figure 2: Empirical cost complexity over $1000$ runs with $\delta=0.01$ on the $5 \times 2$ multi-fidelity bandit.
  • Figure 3: Empirical cost proportions of MF-GRAD for $100000$ iterations on the $5 \times 2$ bandit model. Results are averaged over $100$ runs and shaded areas report $95\%$ confidence intervals. Empirical cost proportions of a given arm are plotted in the same color. Cost proportions at fidelity $1$ and $2$ are shown with a circle and a square, respectively.
  • Figure 4: Empirical cost proportions of MF-GRAD for $100000$ iterations on the $5 \times 2$ bandit model of Section \ref{sec:exp}. Results are averaged over $100$ runs and shaded areas report $95\%$ confidence intervals. Empirical cost proportions of a given arm are plotted in the same color. Cost proportions at fidelity $1$, $2$, $3$, $4$ and $5$ are shown with a circle, square, cross, triangle, and diamond, respectively.
  • Figure 5: Empirical cost complexity over $1000$ runs with $\delta=0.01$ on the multi-fidelity bandit of Table \ref{mu:2}.
  • ...and 10 more figures

Theorems & Definitions (64)

  • Theorem 3.1
  • Theorem 3.2
  • Theorem 4.1
  • Lemma 4.1
  • Theorem 4.2
  • Lemma 4.2
  • Theorem B.1
  • proof
  • Theorem B.1: Theorem 1 in poiani2022multi
  • Lemma B.2
  • ...and 54 more