Accelerating Approximate Thompson Sampling with Underdamped Langevin Monte Carlo

Haoyang Zheng; Wei Deng; Christian Moya; Guang Lin

TL;DR

This work proposes an approximate Thompson sampling strategy based on underdamped Langevin Monte Carlo, a standard tool for sampling from high-dimensional posteriors.

Abstract

Approximate Thompson sampling with Langevin Monte Carlo extends Thompson sampling beyond Gaussian posteriors to more general smooth posteriors. However, it still faces scalability issues in high-dimensional problems when high accuracy is required. To address this, we propose an approximate Thompson sampling strategy based on underdamped Langevin Monte Carlo, a widely used method for simulating high-dimensional posteriors. Under standard smoothness and log-concavity conditions, we study the accelerated posterior concentration and sampling using a specific potential function. This design improves the sample complexity for realizing logarithmic regrets from $\tilde{\mathcal{O}}(d)$ to $\tilde{\mathcal{O}}(\sqrt{d})$. The scalability and robustness of our algorithm are also empirically validated through synthetic experiments in high-dimensional bandit problems.
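To make the idea in the abstract concrete, here is a minimal sketch of Thompson sampling in which the posterior draw is replaced by a short run of underdamped Langevin dynamics. It is not the paper's algorithm: the integrator is a plain Euler-type discretization, the bandit is a toy Gaussian one with a standard normal prior per arm, and the names `ulmc_sample` and `thompson_ulmc`, the friction `gamma=2`, the scaling `u = 1/L`, and the step size are all assumptions chosen for illustration.

```python
import numpy as np

def ulmc_sample(grad_U, x0, n_steps=100, step=0.1, gamma=2.0, u=1.0, rng=None):
    """Approximate draw from exp(-U) via a simple Euler-type discretization of
    underdamped Langevin dynamics:
        dv = -gamma * v dt - u * grad_U(x) dt + sqrt(2 * gamma * u) dB,  dx = v dt.
    Illustrative only; `u` may be a scalar or an array matching x0."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Velocity update: friction, scaled gradient force, injected noise.
        v = v - step * (gamma * v + u * grad_U(x)) + np.sqrt(2.0 * gamma * u * step) * noise
        # Position update uses the freshly updated velocity.
        x = x + step * v
    return x

def thompson_ulmc(true_means, horizon=300, seed=0):
    """Toy Gaussian bandit (assumed setup): unit-variance rewards, N(0, 1) prior
    on each arm mean. Returns the cumulative regret after each round."""
    rng = np.random.default_rng(seed)
    true_means = np.asarray(true_means, dtype=float)
    k = true_means.size
    counts = np.zeros(k)   # number of pulls per arm
    sums = np.zeros(k)     # sum of observed rewards per arm
    regret = np.zeros(horizon)
    total = 0.0
    for t in range(horizon):
        # Per-arm potential U_a(x) = x^2 / 2 + sum_i (r_i - x)^2 / 2,
        # so grad U_a(x) = (n_a + 1) * x - sum_i r_i, with curvature L_a = n_a + 1.
        curv = counts + 1.0
        grad_U = lambda x: curv * x - sums
        # One chain per arm, run in parallel; u = 1 / L_a keeps the step size stable.
        samples = ulmc_sample(grad_U, np.zeros(k), u=1.0 / curv, rng=rng)
        arm = int(np.argmax(samples))
        reward = true_means[arm] + rng.standard_normal()
        counts[arm] += 1.0
        sums[arm] += reward
        total += true_means.max() - true_means[arm]
        regret[t] = total
    return regret
```

Calling `thompson_ulmc([0.0, 0.5, 1.0], horizon=200)` returns an array of cumulative regret values that can be plotted against the round index. In this toy setting the exact posterior is Gaussian, so the Langevin sampler is unnecessary; it stands in for the non-conjugate posteriors the abstract targets.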

Paper Structure (28 sections, 36 theorems, 193 equations, 2 figures, 3 tables, 2 algorithms)

INTRODUCTION
RELATED WORKS
PROBLEM SETTING
POSTERIOR ANALYSIS
Continuous-Time Diffusion Analysis
Discrete-Time Dynamics Analysis
REGRET ANALYSIS
Regrets for Exact Thompson Sampling
Regrets for Approximate Thompson Sampling
EXPERIMENTS
CONCLUSIONS AND DISCUSSIONS
INTRODUCTION TO THOMPSON SAMPLING
PROOF OUTLINE
ANALYSIS OF EXACT THOMPSON SAMPLING
Notation
...and 13 more sections

Key Result

Theorem 1

Suppose that Assumptions assumption_likelihood and assumption_prior hold and that $x\in \mathbb{R}^d$ follows the SDEs eq_sde2. Then, for $x_*\in \mathbb{R}^d$ and $\delta \in \left(0, e^{-0.5}\right)$, the posterior satisfies: with $D=\frac{8d}{\rho}+2\log B$, $\Omega=16\kappa^2d+\frac{256}{\rho}$, where $\kappa=\frac{L}{m}$ is the condition number and $B=\max_x \frac{\pi(x)}{\pi(x_*)}$ represents pr
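The constants $D$ and $\Omega$ named in Theorem 1 can be evaluated directly from their stated formulas. The helper below is a small sketch for doing so; the function name `theorem1_constants` and the example values are hypothetical, and the meaning of $\rho$ is not given in this excerpt, so it is treated as a plain positive input.

```python
import math

def theorem1_constants(d, rho, L, m, B):
    """Evaluate D = 8d/rho + 2 log B and Omega = 16 kappa^2 d + 256/rho,
    with kappa = L / m, exactly as the formulas are stated in Theorem 1.
    All arguments are assumed to be positive numbers."""
    if min(d, rho, L, m, B) <= 0:
        raise ValueError("all inputs must be positive")
    kappa = L / m                              # condition number
    D = 8.0 * d / rho + 2.0 * math.log(B)
    Omega = 16.0 * kappa ** 2 * d + 256.0 / rho
    return D, Omega

# Illustrative values only, not taken from the paper.
D, Omega = theorem1_constants(d=100, rho=0.5, L=4.0, m=1.0, B=1.0)
```

With these illustrative inputs, $D = 1600$ and $\Omega = 26112$; the $\kappa^2 d$ term dominates $\Omega$ as soon as the problem is moderately ill-conditioned.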

Figures (2)

Figure 1: Regret comparisons of TS among different settings.
Figure 2: Restaurant example

Theorems & Definitions (69)

Theorem 1
Theorem 2: Convergence of Underdamped Langevin Monte Carlo
Theorem 3
Theorem 4
Theorem 5
Theorem 6
proof
Lemma 1
proof
Lemma 2
...and 59 more
