Tractable Instances of Bilinear Maximization: Implementing LinUCB on Ellipsoids

Raymond Zhang, Hédi Hadiji, Richard Combes

TL;DR

The paper gives the first known method to implement optimistic algorithms for linear bandits in high dimensions: two novel algorithms that efficiently solve the maximization of $x^\top \theta$ over $(x,\theta) \in \mathcal{X} \times \Theta$ when $\mathcal{X}$ is a centered ellipsoid and $\Theta$ is an ellipsoid.

Abstract

We consider the maximization of $x^\top \theta$ over $(x,\theta) \in \mathcal{X} \times \Theta$, with $\mathcal{X} \subset \mathbb{R}^d$ convex and $\Theta \subset \mathbb{R}^d$ an ellipsoid. This problem is fundamental in linear bandits, as the learner must solve it at every time step using optimistic algorithms. We first show that for some sets $\mathcal{X}$, e.g. $\ell_p$ balls with $p>2$, no efficient algorithm exists unless $\mathcal{P} = \mathcal{NP}$. We then provide two novel algorithms solving this problem efficiently when $\mathcal{X}$ is a centered ellipsoid. Our findings provide the first known method to implement optimistic algorithms for linear bandits in high dimensions.
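To make the bilinear problem concrete, the following minimal sketch handles only the inner step: for a fixed $x$, the maximum of $x^\top \theta$ over an ellipsoid has a closed form. It assumes the ellipsoid is parametrized as $\Theta = \{\theta : (\theta - \hat\theta)^\top V (\theta - \hat\theta) \leq \beta^2\}$ with $V$ symmetric positive definite; the names `theta_hat`, `V`, `beta`, `optimistic_value` and `optimistic_theta` are illustrative and do not come from the paper. The resulting function of $x$ is convex, so maximizing it over a convex $\mathcal{X}$ is a convex maximization problem, which is the hard part the abstract refers to. This sketch does not implement either of the paper's algorithms.

```python
import numpy as np


def optimistic_value(x, theta_hat, V, beta):
    """Return max of x^T theta over the (assumed) ellipsoid
    {theta : (theta - theta_hat)^T V (theta - theta_hat) <= beta^2}.

    Closed form: x^T theta_hat + beta * sqrt(x^T V^{-1} x).
    """
    w = np.linalg.solve(V, x)  # w = V^{-1} x, without forming the inverse
    return float(x @ theta_hat + beta * np.sqrt(x @ w))


def optimistic_theta(x, theta_hat, V, beta):
    """Return a maximizer theta of x^T theta over the same ellipsoid."""
    w = np.linalg.solve(V, x)
    norm = np.sqrt(x @ w)  # ||x|| in the V^{-1} norm
    if norm == 0.0:
        # x = 0: every theta gives value 0, so return the center.
        return theta_hat.copy()
    return theta_hat + beta * w / norm
```

With these helpers, the full problem reduces to maximizing `optimistic_value(x, ...)` over $x \in \mathcal{X}$; only that outer step depends on the shape of $\mathcal{X}$.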

Paper Structure

This paper contains 36 sections, 22 theorems, 108 equations, 8 figures, 4 algorithms.

Key Result

Proposition 1

Consider $\varepsilon \in [0,1)$, and either $\mathcal{X} = \{1-\varepsilon, 1\}$ or $\mathcal{X} = [a, b]$ with $a<0<b$. Then there exists an approximate $\varepsilon$-LinUCB algorithm and parameters $\zeta$ with such that $\liminf_{T \rightarrow +\infty} R_T(\zeta)/T \geqslant \varepsilon$.

Figures (8)

  • Figure 1: Histogram of the entries of the centers $b$ for different dimensions and time.
  • Figure 2: Histogram of the eigenvalues for different dimensions and time.
  • Figure 3: Value vs. time for instances generated from runs of OLSUCB, as a function of $d$ and $\|\zeta\|_{2}$.
  • Figure 4: Running time of MaxNorm and Newton as a function of $\kappa$ for different distributions.
  • Figure 5: Running time of MaxNorm and Newton as a function of $d$ for different distributions.
  • ...and 3 more figures

Theorems & Definitions (39)

  • Proposition 1
  • Proposition 2
  • proof
  • Proposition 3
  • proof
  • Proposition 4
  • Proposition 5
  • proof
  • Theorem 1: Correctness of MaxNorm
  • proof
  • ...and 29 more