LinearAPT: An Adaptive Algorithm for the Fixed-Budget Thresholding Linear Bandit Problem

Yun-Ang Wu; Yun-Da Tsai; Shou-De Lin

TL;DR

This study examines the Thresholding Linear Bandit (TLB) problem, a variant of the stochastic Multi-Armed Bandit (MAB) problem that focuses on maximizing decision accuracy against a linearly defined threshold under resource constraints. It presents LinearAPT, an algorithm designed for the fixed-budget setting of TLB that provides an efficient solution for sequential decision-making.

Abstract

In this study, we delve into the Thresholding Linear Bandit (TLB) problem, a nuanced domain within stochastic Multi-Armed Bandit (MAB) problems, focusing on maximizing decision accuracy against a linearly defined threshold under resource constraints. We present LinearAPT, a novel algorithm designed for the fixed budget setting of TLB, providing an efficient solution to optimize sequential decision-making. This algorithm not only offers a theoretical upper bound for estimated loss but also showcases robust performance on both synthetic and real-world datasets. Our contributions highlight the adaptability, simplicity, and computational efficiency of LinearAPT, making it a valuable addition to the toolkit for addressing complex sequential decision-making challenges.
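To make the problem setting concrete, the following is a minimal sketch of a fixed-budget thresholding task with linear rewards. It is not LinearAPT: it uses a plain round-robin (uniform) allocation followed by a ridge-regression estimate, purely as an illustrative baseline. It assumes that each arm is a feature vector `x`, that its mean reward is the inner product of `x` with an unknown parameter `theta`, that the noise is Gaussian, and that the threshold `tau` is a scalar. All names (`uniform_tlb_baseline`, `solve`, `dot`, `lam`) are hypothetical and do not come from the paper.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def uniform_tlb_baseline(arms, theta, tau, budget, noise=0.1, lam=1.0, seed=0):
    """Illustrative baseline (NOT LinearAPT): pull arms round-robin for
    `budget` rounds, fit ridge regression, then label each arm by whether
    its estimated mean is at least `tau`."""
    rng = random.Random(seed)
    d = len(theta)
    # Regularised design matrix V = lam * I + sum of x x^T, and b = sum of r * x.
    V = [[lam if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    for t in range(budget):
        x = arms[t % len(arms)]  # uniform allocation (assumption)
        reward = dot(x, theta) + rng.gauss(0.0, noise)  # assumed Gaussian noise
        for i in range(d):
            b[i] += reward * x[i]
            for j in range(d):
                V[i][j] += x[i] * x[j]
    theta_hat = solve(V, b)
    return [dot(x, theta_hat) >= tau for x in arms]


# Hypothetical instance: four arms in two dimensions, threshold 0.5.
arms = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [0.2, 0.1]]
labels = uniform_tlb_baseline(arms, theta=[0.9, 0.1], tau=0.5, budget=400)
print(labels)
```

In this toy instance the true means are 0.9, 0.1, 0.7, and 0.19, so a correct answer labels the first and third arms as above the threshold. An adaptive method such as the one the paper proposes would be expected to concentrate pulls on arms whose means are hard to separate from the threshold, rather than spreading the budget evenly as this baseline does.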

Paper Structure (25 sections, 4 theorems, 28 equations, 4 figures, 3 tables, 1 algorithm)

Introduction
Related Works
Unstructured Thresholding Bandit Problems
Structured Thresholding Bandit Problems
Level Set Estimation
Problem Definition
Linear Bandit
Thresholding Bandit
Algorithm
LinearAPT
Theoretical Bound
Analysis
Experiments
Setup
Dataset
...and 10 more sections

Key Result

Lemma 1

(NIPS2014_f387624d) For any $\delta > 0$, with probability at least $1-\delta$:

Figures (4)

Figure 1: (a) Uniform Box, $d = 5$
Figure 2: (b) Uniform Box, $d = 20$
Figure 4: (a) Modified version of iris dataset
Figure 5: (b) Modified version of wine dataset

Theorems & Definitions (10)

Lemma 1
proof
Lemma 2
proof
Lemma 3
proof
Theorem 1
proof
proof
proof
