An Asymptotically Optimal Algorithm for the Convex Hull Membership Problem

Gang Qiao, Ambuj Tewari

TL;DR

This work presents Thompson-CHM, the first asymptotically optimal algorithm for the convex hull membership (CHM) problem. Its modular design consists of a stopping rule and a sampling rule, and the algorithm is extended to settings that generalize several important problems in the multi-armed bandit literature.

Abstract

We study the convex hull membership (CHM) problem in the pure exploration setting, where one aims to efficiently and accurately determine whether a given point lies in the convex hull of the means of a finite set of distributions. We give a complete characterization of the sample complexity of the CHM problem in the one-dimensional case. We present the first asymptotically optimal algorithm, Thompson-CHM, whose modular design consists of a stopping rule and a sampling rule. In addition, we extend the algorithm to settings that generalize several important problems in the multi-armed bandit literature. Furthermore, we discuss the extension of Thompson-CHM to higher dimensions. Finally, we provide numerical experiments to demonstrate that the empirical behavior of the algorithm matches our theoretical results for realistic time horizons.
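
As a minimal sketch of the one-dimensional problem statement: the convex hull of a finite set of real-valued means is the interval between the smallest and the largest mean, so membership reduces to an interval check. The sketch assumes the means are known exactly, which is not the setting of the paper (there the means must be inferred from samples). The function name `in_convex_hull_1d` and the example values are illustrative and do not come from the paper.

```python
from typing import Sequence


def in_convex_hull_1d(means: Sequence[float], gamma: float) -> bool:
    """Return True if gamma lies in the convex hull of the given 1-D means.

    Assumption: the means are known exactly. In one dimension the convex
    hull of finitely many points is the closed interval [min, max].
    """
    if len(means) == 0:
        raise ValueError("need at least one mean")
    return min(means) <= gamma <= max(means)


# Hypothetical example values, chosen only for illustration.
example_means = [0.2, 0.5, 0.9]
inside = in_convex_hull_1d(example_means, 0.4)   # gamma between min and max
outside = in_convex_hull_1d(example_means, 1.0)  # gamma above the largest mean
```

The pure exploration problem is harder than this check because only noisy samples of each distribution are available, which is why a sampling rule and a stopping rule are needed.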

Paper Structure (22 sections, 11 theorems, 47 equations, 2 figures, 2 algorithms)

Key Result

Theorem 1

Given a threshold $\gamma \in \mathbb{R}$, the expected sample complexity $\mathbb{E}_{\boldsymbol{\mu}}[\tau]$ of any $\delta$-correct one-dimensional CHM strategy satisfies $\liminf_{\delta \rightarrow 0} \frac{\mathbb{E}_{\boldsymbol{\mu}}[\tau]}{\ln(1/\delta)} \ge T^*(\boldsymbol{\mu})$, where $T^*(\boldsymbol{\mu})$ is an instance-dependent quantity defined in the paper.
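
To read the bound concretely: as $\delta \rightarrow 0$, the expected number of samples grows at least like $T^*(\boldsymbol{\mu}) \ln(1/\delta)$. The sketch below evaluates that leading term for a given value of $T^*$. The function name and the numeric inputs are placeholders, not values from the paper, and because the bound is asymptotic the result is only a rough guide for small $\delta$.

```python
import math


def leading_term_lower_bound(t_star: float, delta: float) -> float:
    """Evaluate T* * ln(1/delta), the leading term of the lower bound.

    t_star is a hypothetical value of T^*(mu); the paper's definition of
    T^* is not reproduced here. delta is the allowed error probability.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if t_star < 0.0:
        raise ValueError("t_star must be non-negative")
    return t_star * math.log(1.0 / delta)


# Halving the error probability adds T* * ln(2) samples to the leading term.
bound_a = leading_term_lower_bound(50.0, 0.01)
bound_b = leading_term_lower_bound(50.0, 0.005)
extra_samples = bound_b - bound_a
```

The logarithmic dependence on $1/\delta$ means that tightening the confidence level is comparatively cheap; the instance-dependent factor $T^*(\boldsymbol{\mu})$ dominates the cost.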

Figures (2)

  • Figure 1: Sample complexity for different values of $\gamma$ in feasible cases (left) and infeasible cases (right).
  • Figure 2: Empirical proportion of samples compared to the optimal allocation $\boldsymbol{w}^*(\boldsymbol{\mu})$ in feasible cases (left) and infeasible cases (right), estimated using 100 repetitions.

Theorems & Definitions (13)

  • Definition 3.1
  • Definition 3.2
  • Theorem 1
  • Lemma 5.1
  • Theorem 2
  • Proposition 5.2
  • Theorem 3
  • Lemma 6.1
  • Theorem 4
  • Theorem 5
  • ...and 3 more