Local Anti-Concentration Class: Logarithmic Regret for Greedy Linear Contextual Bandit

Seok-Jin Kim, Min-hwan Oh

TL;DR

This paper investigates exploration-free greedy policies for linear contextual bandits under stochastic contexts by introducing Local Anti-Concentration (LAC), a condition that broadens the distributional class supporting efficient greedy learning. It proves that distributions satisfying LAC (including Gaussian, exponential, uniform, Cauchy, Student's $t$, and truncated variants) lead to a cumulative regret of $O(\operatorname{poly}\log T)$ for LinGreedy, with $\tilde{O}(d^{2.5})$-type dependence in unbounded settings and $\sqrt{t}$-consistency of the parameter estimator. The core technical advances are twofold: (i) establishing a positive diversity constant ensuring growth of the adapted Gram matrix, and (ii) bounding the suboptimality gap probabilistically via a margin constant, both derived under the LAC framework rather than assumed. Empirical results across several light-tailed and heavy-tailed distributions corroborate the theoretical findings, demonstrating strong performance of LinGreedy relative to exploration-based baselines. This work broadens the practical applicability of greedy bandits and provides sharp poly-log regret guarantees for a wide range of context distributions.

Abstract

We study the performance guarantees of exploration-free greedy algorithms for the linear contextual bandit problem. We introduce a novel condition, named the Local Anti-Concentration (LAC) condition, which enables a greedy bandit algorithm to achieve provable efficiency. We show that the LAC condition is satisfied by a broad class of distributions, including Gaussian, exponential, uniform, Cauchy, and Student's $t$ distributions, along with other exponential family distributions and their truncated variants. This significantly expands the class of distributions under which greedy algorithms can perform efficiently. Under our proposed LAC condition, we prove that the cumulative expected regret of the greedy algorithm for the linear contextual bandit is bounded by $O(\operatorname{poly} \log T)$. Our results establish the widest range of distributions known to date that allow a sublinear regret bound for greedy algorithms, further achieving a sharp poly-logarithmic regret.
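
To make the exploration-free policy concrete, the following is a minimal sketch of a greedy linear contextual bandit loop, not the authors' implementation. It assumes a setup the summary does not spell out: each round draws `n_arms` i.i.d. standard Gaussian context vectors, all arms share one parameter `theta_star`, rewards carry Gaussian noise, and the estimate is a ridge-regularized least-squares fit. The function name `lin_greedy` and every parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def lin_greedy(theta_star, n_rounds=2000, n_arms=5, noise_sd=0.1, ridge=1.0, seed=0):
    """Greedy (exploration-free) linear bandit sketch under assumed Gaussian contexts.

    Returns the final parameter estimate and the cumulative expected regret path.
    """
    rng = np.random.default_rng(seed)
    d = theta_star.shape[0]
    gram = ridge * np.eye(d)      # regularized Gram matrix of the chosen contexts (assumption: ridge)
    moment = np.zeros(d)          # running sum of reward * chosen context
    theta_hat = np.zeros(d)       # current estimate; all zeros before any data
    regret = np.zeros(n_rounds)
    total = 0.0

    for t in range(n_rounds):
        # Assumed context model: one standard Gaussian vector per arm.
        contexts = rng.standard_normal((n_arms, d))

        # Greedy step: pick the arm that looks best under the current estimate,
        # with no exploration bonus and no forced sampling.
        arm = int(np.argmax(contexts @ theta_hat))
        x = contexts[arm]
        reward = x @ theta_star + noise_sd * rng.standard_normal()

        # Expected regret of this round relative to the best arm.
        expected = contexts @ theta_star
        total += expected.max() - expected[arm]
        regret[t] = total

        # Update the Gram matrix and refit the least-squares estimate.
        gram += np.outer(x, x)
        moment += reward * x
        theta_hat = np.linalg.solve(gram, moment)

    return theta_hat, regret


if __name__ == "__main__":
    theta_star = np.array([1.0, -0.5, 0.25])
    theta_hat, regret = lin_greedy(theta_star)
    print("estimate:", theta_hat)
    print("final cumulative regret:", regret[-1])
```

In this sketch the Gram matrix grows only through the contexts the greedy rule happens to select, which is why a condition on the context distribution is needed for the estimate to converge. Swapping `rng.standard_normal` for another sampler is one way to probe other context distributions empirically.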

Paper Structure

This paper contains 141 sections, 46 theorems, 228 equations, 19 figures, 1 table, 1 algorithm.

Key Result

proposition 1

Suppose the random variable $X = (X_1, X_2)$, where $X_1 \in \RR^{n_1}$ and $X_2 \in \RR^{n_2}$, consists of two independent components. If $X_1$ and $X_2$ satisfy the LAC condition with functions $\la_1(\cdot)$ and $\la_2(\cdot)$, respectively, then $X$ satisfies the LAC condition with $\la(x) = \m

Figures (19)

  • Figure 1: The cumulative regret plots of the numerical experiments. The full results are available in the experiments appendix.
  • Figure 2: Illustration of an expanding section. The section with direction $v$ is expanding as $y$ increases from $y$ to $y+h$.
  • Figure 3: Illustration of $A_1$ and expanding sections of $v$ or $-v$. $A_1$ is the area above the green line. In this case, sections with direction $+v$ are expanding. If a cylindrical set is cut by some hyperplane (which is $A_1$), at least one direction yields expanding sections.
  • Figure 4: Illustration of $A_1$ and expanding sections of $v$ or $-v$. $A_1$ is the area above the green line. In this case, sections with direction $+v$ are expanding. If a cylindrical set is cut by some hyperplane, at least one direction produces expanding sections.
  • Figure 5: Illustration of $S$ and expanding sections with direction $e_n$. Blue lines are sections $\Sec(S',e_n,y)$. If a cylindrical-shaped set is sliced by some hyperplane, at least one direction forms expanding sections.
  • ...and 14 more figures

Theorems & Definitions (99)

  • definition 1: Local Anti-Concentration (LAC)
  • proposition 1
  • definition 2: Diversity Constant
  • theorem 1: Regret bound of LinGreedy
  • theorem 2: Diversity constant for unbounded contexts
  • theorem 3: Suboptimality gap for unbounded contexts
  • lemma 1: LAC of conditional contexts
  • lemma 2: LAC with truncation
  • proof
  • definition 3: Decay rate
  • ...and 89 more