Prior-Dependent Allocations for Bayesian Fixed-Budget Best-Arm Identification in Structured Bandits

Nicolas Nguyen, Imad Aouali, András György, Claire Vernade

TL;DR

This work studies Bayesian fixed-budget BAI in structured bandits and introduces PI-BAI, a non-adaptive allocation-based method that leverages prior information and problem structure to minimize the Bayes risk, measured by the expected probability of error (PoE), $\mathcal{P}_n$. It develops prior-dependent PoE upper bounds for multi-armed, linear, and hierarchical bandits using a novel Bayesian proof strategy that avoids strong prior-homogeneity assumptions. The authors propose several allocation strategies (Optimized, G-optimal, and warm-up TS) and show that these prior-dependent designs yield tighter bounds and robust empirical performance, including on real data such as MovieLens. They further analyze robustness to prior misspecification and show that the $\mathcal{O}(1/\sqrt{n})$ rate persists under mild misspecification, while online or offline prior learning can mitigate its effects. Overall, the paper advances Bayesian fixed-budget BAI by providing a versatile, structure-aware framework with theoretical guarantees and practical efficacy.

Abstract

We study the problem of Bayesian fixed-budget best-arm identification (BAI) in structured bandits. We propose an algorithm that uses fixed allocations based on the prior information and the structure of the environment. We provide theoretical bounds on its performance across diverse models, including the first prior-dependent upper bounds for linear and hierarchical BAI. Our key contribution is introducing new proof methods that result in tighter bounds for multi-armed BAI compared to existing methods. We extensively compare our approach to other fixed-budget BAI methods, demonstrating its consistent and robust performance in various settings. Our work improves our understanding of Bayesian fixed-budget BAI in structured bandits and highlights the effectiveness of our approach in practical scenarios.
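
As a rough illustration of what "fixed allocations based on the prior information" can look like in the simplest case, the sketch below assumes a Gaussian multi-armed bandit with known noise variance and independent Gaussian priors per arm. Each arm is pulled a number of times fixed in advance by its allocation weight, and the arm with the largest posterior mean is returned. The function names, the floor rounding of pull counts, and the final decision rule are assumptions made for this example, not the paper's exact algorithm.

```python
import random


def gaussian_posterior(mu0, var0, noise_var, rewards):
    """Conjugate Gaussian update for one arm with known noise variance."""
    m = len(rewards)
    post_var = 1.0 / (1.0 / var0 + m / noise_var)
    post_mean = post_var * (mu0 / var0 + sum(rewards) / noise_var)
    return post_mean, post_var


def fixed_allocation_bai(true_means, mu0, var0, noise_var, weights, budget, rng):
    """Non-adaptive BAI sketch: pull counts are fixed before any reward is seen.

    Assumption: arm i gets floor(weights[i] * budget) pulls; any leftover
    budget is simply unused in this sketch.
    """
    post_means = []
    for theta, m0, v0, w in zip(true_means, mu0, var0, weights):
        pulls = int(w * budget)
        rewards = [rng.gauss(theta, noise_var ** 0.5) for _ in range(pulls)]
        post_mean, _ = gaussian_posterior(m0, v0, noise_var, rewards)
        post_means.append(post_mean)
    # Decision rule (assumed): recommend the arm with the largest posterior mean.
    return max(range(len(post_means)), key=post_means.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    arm = fixed_allocation_bai(
        true_means=[0.2, 0.5, 0.9],
        mu0=[0.0, 0.0, 0.0],
        var0=[1.0, 1.0, 1.0],
        noise_var=1.0,
        weights=[1 / 3, 1 / 3, 1 / 3],
        budget=300,
        rng=rng,
    )
    print("recommended arm:", arm)
```

Because the pull counts depend only on the weights and the budget, the choice of weights is the only place where prior information enters before data collection; the prior also shapes the final recommendation through the posterior means.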

Paper Structure (35 sections, 14 theorems, 101 equations, 13 figures, 1 algorithm)


Key Result

Theorem 4.1

For all $\omega \in \Delta^+_K$, the expected PoE of $\texttt{PI-BAI}$ with allocation $\omega$ under the Gaussian MAB model is upper bounded by an explicit prior-dependent expression in $n$ and the quantities $\phi_{i,j}(\omega)$ (the displayed bound and the definition of $\phi_{i,j}$ are not reproduced here). In particular, $\phi_{i,j}(\omega)=\Omega(1)$ depends on the prior parameters and allocation weights, and $\mathcal{P}_n = \mathcal{O}(1/\sqrt{n})$.
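
The expected PoE in this statement averages over bandit instances drawn from the prior, which can be approximated by simulation. The sketch below is a Monte Carlo estimate under assumptions chosen for this example: independent Gaussian priors, known noise variance, at least one pull per arm, and a recommendation by largest posterior mean. The function name and every parameter are hypothetical, and the code does not evaluate the theorem's bound itself.

```python
import math
import random


def estimate_expected_poe(mu0, var0, noise_var, weights, budget, trials, seed=0):
    """Monte Carlo estimate of the prior-averaged probability of error."""
    rng = random.Random(seed)
    k = len(mu0)
    errors = 0
    for _ in range(trials):
        # Draw a bandit instance from the (assumed) Gaussian prior.
        theta = [rng.gauss(mu0[i], math.sqrt(var0[i])) for i in range(k)]
        post_means = []
        for i in range(k):
            # Assumption: floor rounding, with at least one pull per arm.
            pulls = max(1, int(weights[i] * budget))
            # The mean of `pulls` Gaussian rewards is itself Gaussian,
            # so it can be sampled directly instead of reward by reward.
            ybar = rng.gauss(theta[i], math.sqrt(noise_var / pulls))
            post_var = 1.0 / (1.0 / var0[i] + pulls / noise_var)
            post_means.append(post_var * (mu0[i] / var0[i] + pulls * ybar / noise_var))
        guess = max(range(k), key=post_means.__getitem__)
        best = max(range(k), key=theta.__getitem__)
        errors += guess != best
    return errors / trials


if __name__ == "__main__":
    prior_means = [0.0, 0.0, 0.0]
    prior_vars = [1.0, 1.0, 1.0]
    uniform = [1 / 3, 1 / 3, 1 / 3]
    for n in (10, 100, 1000):
        poe = estimate_expected_poe(prior_means, prior_vars, 1.0, uniform, n, trials=2000)
        print(n, poe)
```

Running the loop for increasing budgets gives an empirical curve that can be compared against the $\mathcal{O}(1/\sqrt{n})$ rate, and swapping in different weight vectors gives a quick way to compare allocations on the same prior.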

Figures (13)

  • Figure 1: Bound of $\texttt{PI-BAI}$ with different weights compared to $\texttt{BayesElim}$ (Atsidakou et al., 2022).
  • Figure 2: Average PoE with varying $n$: comparison to baselines and impact of prior misspecification.
  • Figure 3: Average posterior covariance across all arms for standard MAB and hierarchical model for two settings.
  • Figure 4: Allocation weights $\omega^{\rm{opt}}_i$ and $\omega^{\rm{TS}}_i$.
  • Figure 5: Average PoE of $\texttt{PI-BAI}$ with varying budget and allocations (uniform, optimized, G-optimal and TS warmed-up weights) for different levels of mean misspecification (first row) and variance misspecification (second row).
  • ...and 8 more figures

Theorems & Definitions (23)

  • Theorem 4.1: Upper bound for MAB
  • Theorem 4.2: Upper bound for linear bandits
  • Corollary 4.3: Upper bound of $\texttt{PI-BAI}(\text{G-opt})$
  • Theorem 4.4: Upper bound for hierarchical bandits
  • Lemma 4.5: Upper bound for $\texttt{PI-BAI}$ with misspecified prior parameters
  • Lemma C.1: Gaussian posterior update
  • Proof of Lemma C.1
  • Lemma C.2: Joint effect posterior
  • Proof of Lemma C.2
  • Lemma C.3: Conditional arm posteriors
  • ...and 13 more