Accelerating Approximate Thompson Sampling with Underdamped Langevin Monte Carlo

Haoyang Zheng, Wei Deng, Christian Moya, Guang Lin

TL;DR

This work proposes an approximate Thompson sampling strategy built on underdamped Langevin Monte Carlo, a standard workhorse for sampling from high-dimensional posteriors.

Abstract

Approximate Thompson sampling with Langevin Monte Carlo extends Thompson sampling from Gaussian posteriors to more general smooth posteriors. However, it still scales poorly in high-dimensional problems when high accuracy is required. To address this, we propose an approximate Thompson sampling strategy built on underdamped Langevin Monte Carlo, a standard workhorse for sampling from high-dimensional posteriors. Under standard smoothness and log-concavity conditions, we study accelerated posterior concentration and sampling using a specific potential function. This design improves the sample complexity for realizing logarithmic regret from $\tilde{\mathcal{O}}(d)$ to $\tilde{\mathcal{O}}(\sqrt{d})$. The scalability and robustness of our algorithm are also validated empirically through synthetic experiments on high-dimensional bandit problems.
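To make the idea in the abstract concrete, the following is a minimal sketch of Thompson sampling in which each posterior draw comes from a discretized underdamped Langevin sampler. It is not the paper's algorithm: the bandit model (unit-variance Gaussian rewards with a standard normal prior on each arm mean), the simple Euler-type discretization, the step-size and friction choices, and the names `ulmc_sample` and `thompson_sampling` are all assumptions made for illustration.

```python
import numpy as np


def ulmc_sample(grad_U, x0, lipschitz, n_steps=100, rng=None):
    """Draw one approximate sample from exp(-U) with underdamped Langevin dynamics.

    Continuous-time dynamics (unit mass):
        dx = v dt
        dv = -gamma * v dt - grad_U(x) dt + sqrt(2 * gamma) dB_t
    discretized here with a simple Euler-type scheme.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Illustrative tuning (an assumption, not the paper's schedule):
    # scale the step and friction with the gradient Lipschitz constant.
    step = 0.1 / np.sqrt(lipschitz)
    gamma = 2.0 * np.sqrt(lipschitz)
    x = np.array(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Velocity update: friction, potential gradient, injected noise.
        v = v - step * (gamma * v + grad_U(x)) + np.sqrt(2.0 * gamma * step) * noise
        # Position update uses the freshly updated velocity.
        x = x + step * v
    return x


def thompson_sampling(true_means, horizon=300, seed=0):
    """Thompson sampling on a Gaussian bandit with ULMC posterior draws."""
    rng = np.random.default_rng(seed)
    n_arms = len(true_means)
    pulls = np.zeros(n_arms, dtype=int)
    reward_sums = np.zeros(n_arms)
    for _ in range(horizon):
        sampled = np.empty(n_arms)
        for a in range(n_arms):
            # N(0, 1) prior and unit-variance rewards give the potential
            # U(x) = 0.5 * precision * x**2 - reward_sum * x (up to a constant).
            precision = 1.0 + pulls[a]
            grad_U = lambda x, p=precision, s=reward_sums[a]: p * x - s
            sampled[a] = ulmc_sample(grad_U, np.zeros(1), precision, rng=rng)[0]
        arm = int(np.argmax(sampled))  # act greedily on the sampled means
        reward = true_means[arm] + rng.standard_normal()
        pulls[arm] += 1
        reward_sums[arm] += reward
    return pulls


pulls = thompson_sampling([0.0, 0.5, 2.0])
print(pulls)  # number of times each arm was played
```

The toy posterior here is Gaussian, so exact sampling would be possible; the sampler is used only to show where the Langevin step enters the Thompson sampling loop. Only `grad_U` would need to change for a non-Gaussian smooth, log-concave posterior.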

Paper Structure

This paper contains 28 sections, 36 theorems, 193 equations, 2 figures, 3 tables, 2 algorithms.

Key Result

Theorem 1

Suppose that Assumptions assumption_likelihood and assumption_prior hold, and that $x\in\mathbb{R}^d$ follows the SDEs eq_sde2. Then, for $x_*\in\mathbb{R}^d$ and $\delta\in\left(0, e^{-0.5}\right)$, the posterior satisfies: with $D=\frac{8d}{\rho}+2\log B$, $\Omega=16\kappa^2 d+\frac{256}{\rho}$, where $\kappa=\frac{L}{m}$ is the condition number and $B=\max_x \frac{\pi(x)}{\pi(x_*)}$ represents pr

Figures (2)

  • Figure 1: Regret comparisons of TS among different settings.
  • Figure 2: Restaurant example

Theorems & Definitions (69)

  • Theorem 1
  • Theorem 2: Convergence of Underdamped Langevin Monte Carlo
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • proof
  • Lemma 1
  • proof
  • Lemma 2
  • ...and 59 more