Constrained Best Arm Identification with Tests for Feasibility

Ting Cai, Kirthevasan Kandasamy

TL;DR

This paper defines a novel constrained best-arm identification problem in which each arm has a performance distribution and multiple feasibility constraints that can be tested separately. It introduces a LUCB-inspired algorithm that, at each step, adaptively chooses to test either an arm's performance or one of its feasibility constraints, and proves a fixed-confidence guarantee with a gap-dependent lower bound and a matching upper bound. The key contributions are a problem-dependent complexity framework, a $\delta$-correct algorithm with asymptotically optimal sample complexity as $\delta \to 0$, and empirical results on synthetic and real-world drug-discovery datasets in which the algorithm outperforms other BAI algorithms. The work shows that testing feasibility separately can substantially reduce the number of samples needed to identify the feasible arm with maximal performance.

Abstract

Best arm identification (BAI) aims to identify the highest-performance arm among a set of $K$ arms by collecting stochastic samples from each arm. In real-world problems, the best arm needs to satisfy additional feasibility constraints. The limited prior work on BAI with feasibility constraints typically assumes that the performance and constraints are observed simultaneously on each pull of an arm. However, this assumption does not reflect most practical use cases. For example, in drug discovery, we wish to find the most potent drug whose toxicity and solubility are below certain safety thresholds, and these safety experiments can be conducted separately from the potency measurement. This requires designing BAI algorithms that decide not only which arm to pull but also whether to test for the arm's performance or feasibility. In this work, we study feasible BAI, which allows a decision-maker to choose a tuple $(i,\ell)$, where $i\in [K]$ denotes an arm and $\ell$ denotes whether she wishes to test for its performance ($\ell=0$) or any of its $N$ feasibility constraints ($\ell\in[N]$). We focus on the fixed-confidence setting, where the goal is to identify the feasible arm with the highest performance with probability at least $1-\delta$. We propose an efficient algorithm and upper-bound its sample complexity, showing that our algorithm naturally adapts to the problem's difficulty and eliminates arms by worse performance or infeasibility, whichever is easier. We complement this upper bound with a lower bound showing that our algorithm is asymptotically ($\delta\rightarrow 0$) optimal. Finally, we empirically show that our algorithm outperforms other state-of-the-art BAI algorithms on both synthetic and real-world datasets.
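To make the query model concrete, here is a minimal Python sketch of the setting described in the abstract. It is not the paper's algorithm or code. The class name `FeasibleBanditInstance`, the zero-indexed arms, the unit-variance Gaussian noise, and the convention that a constraint is satisfied when its mean is at or below its threshold are all illustrative assumptions. The sketch only shows how a pull of the tuple $(i,\ell)$ returns one noisy observation, and how the target arm is defined when the true means are known.

```python
import random
from dataclasses import dataclass, field


@dataclass
class FeasibleBanditInstance:
    """Hypothetical instance with K arms and N feasibility constraints.

    means[i][0] is the performance mean of arm i; means[i][l] for
    l = 1..N is the mean of its l-th constraint (assumed layout).
    """
    means: list        # K rows, each of length N + 1
    thresholds: list   # length N; assumption: feasible iff mean <= threshold
    sigma: float = 1.0  # assumed Gaussian noise scale
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def pull(self, i, l):
        """One noisy observation for the tuple (i, l): l = 0 tests
        performance, l >= 1 tests the l-th feasibility constraint."""
        return self.rng.gauss(self.means[i][l], self.sigma)

    def is_feasible(self, i):
        """Arm i is feasible when every constraint mean meets its threshold."""
        return all(m <= t for m, t in zip(self.means[i][1:], self.thresholds))

    def optimal_feasible_arm(self):
        """Oracle answer: the feasible arm with the highest performance mean,
        or None when no arm is feasible."""
        feasible = [i for i in range(len(self.means)) if self.is_feasible(i)]
        if not feasible:
            return None
        return max(feasible, key=lambda i: self.means[i][0])


# Illustrative numbers only (K = 3 arms, N = 2 constraints).
instance = FeasibleBanditInstance(
    means=[
        [0.9, 0.8, 0.2],  # best performance, but constraint 1 exceeds 0.5
        [0.7, 0.3, 0.1],  # feasible
        [0.5, 0.2, 0.2],  # feasible, lower performance
    ],
    thresholds=[0.5, 0.5],
)

best = instance.optimal_feasible_arm()  # arm 0 is infeasible, so this is arm 1
sample = instance.pull(1, 0)            # one noisy performance observation of arm 1
```

In this toy instance the arm with the highest performance is infeasible, so a learner could rule it out either by testing its first constraint or by comparing performance, which is the choice the abstract says the algorithm makes adaptively.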

Paper Structure

This paper contains 27 sections, 20 theorems, 129 equations, 3 figures, 3 tables, 1 algorithm.

Key Result

Theorem 1

Let $\nu$ denote a bandit instance with Gaussian observations satisfying the assumptions in the Problem Setup section. Let $\delta \in (0,1)$ and let $\mathcal{H}$ be as defined in the paper's equation for $\mathcal{H}$. Any algorithm $\mathcal{A}$ that is $\delta$-correct has a stopping time $\tau$ on $\nu$ that satisfies

Figures (3)

  • Figure 1: An example bandit instance when $K=5$ and $N=1$. The optimal feasible arm is $i^\star=2$.
  • Figure 2: Results for Experiment 2. Results are averaged over 10 runs and the error bars are the standard deviations.
  • Figure 3: Drug discovery application. P-first is excluded due to poor performance. Results are averaged over 10 runs and the error bars are the standard deviations.

Theorems & Definitions (37)

  • Theorem 1
  • Theorem 2
  • Corollary 2.1
  • Proof of Theorem (label: thm: upper bound)
  • Lemma 3
  • Lemma 4
  • Proof of Lemma (label: lemma: clean event)
  • Proof of Lemma (label: appendix: lemma:delta-correct)
  • Lemma 5
  • Proof of Lemma (label: lemma: sample complexity)
  • ...and 27 more