New Classes of the Greedy-Applicable Arm Feature Distributions in the Sparse Linear Bandit Problem

Koji Ichikawa, Shinji Ito, Daisuke Hatano, Hanna Sumita, Takuro Fukunaga, Naonori Kakimura, Ken-ichi Kawarabayashi

TL;DR

This work broadens the applicability of greedy arm selection in sparse linear contextual bandits by relaxing restrictive assumptions on the arm feature distribution. It proves a mixture-closure property: if one component of a mixture distribution is greedy-applicable, the mixture is also greedy-applicable, up to a scaling factor. It then introduces concrete basis classes (Gaussian mixtures, low-rank Gaussian mixtures with discrete limits, and radial bases) that can capture origin-asymmetric supports. These results yield regret guarantees for several greedy-based algorithms, covering sparse single-parameter, sparse multi-parameter, and combinatorial settings, without requiring prior knowledge of the sparsity level. They extend sparsity-agnostic greedy algorithms to a wider array of arm feature distributions, including origin-asymmetric and truncated ones, with implications for recommendation, clinical trials, and high-dimensional contextual decision making.

Abstract

We consider the sparse contextual bandit problem, in which arm features affect the reward through their inner product with sparse parameters. Recent studies have developed sparsity-agnostic algorithms based on the greedy arm selection policy. However, the analysis of these algorithms requires strong assumptions on the arm feature distribution to ensure that the greedily selected samples are sufficiently diverse. One of the most common assumptions, relaxed symmetry, imposes approximate origin-symmetry on the distribution and therefore excludes distributions that have origin-asymmetric support. In this paper, we show in two ways that the greedy algorithm is applicable to a wider range of arm feature distributions. First, we show that a mixture distribution that has a greedy-applicable component is also greedy-applicable. Second, we propose new distribution classes, related to Gaussian mixture, discrete, and radial distributions, for which sample diversity is guaranteed. The proposed classes can describe distributions with origin-asymmetric support and, in conjunction with the first claim, provide theoretical guarantees for the greedy policy over a very wide range of arm feature distributions.
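The greedy arm selection policy described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names (`lasso_cd`, `greedy_arm`, `run_greedy_lasso`), the standard Gaussian arm features, the noise level, and the regularization schedule `lam_t` are all assumptions chosen for illustration. The sketch fits a Lasso estimate by coordinate descent on the samples collected so far and then pulls the arm with the largest estimated reward, with no explicit exploration step.

```python
import math
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used by Lasso coordinate descent."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=10):
    """Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, d = len(X), len(X[0])
    beta = [0.0] * d
    resid = list(y)  # residual y - X beta, with beta = 0 initially
    for _ in range(n_iter):
        for j in range(d):
            col_sq = sum(X[i][j] ** 2 for i in range(n)) / n
            if col_sq == 0.0:
                continue
            # Correlation of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def dot(x, b):
    return sum(x_j * b_j for x_j, b_j in zip(x, b))


def greedy_arm(arms, beta_hat):
    """Index of the arm with the largest estimated reward <x, beta_hat>."""
    scores = [dot(x, beta_hat) for x in arms]
    return max(range(len(arms)), key=scores.__getitem__)


def run_greedy_lasso(beta_star, n_arms=5, horizon=100, noise=0.1, lam0=0.05, seed=0):
    """Simulate greedy arm selection with a Lasso estimate (illustrative only)."""
    rng = random.Random(seed)
    d = len(beta_star)
    X_hist, y_hist = [], []
    beta_hat = [0.0] * d
    regret = 0.0
    for t in range(1, horizon + 1):
        # Assumption: i.i.d. standard Gaussian arm features.
        arms = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n_arms)]
        k = greedy_arm(arms, beta_hat)  # pure exploitation
        means = [dot(x, beta_star) for x in arms]
        regret += max(means) - means[k]
        X_hist.append(arms[k])
        y_hist.append(means[k] + rng.gauss(0.0, noise))
        # Assumption: a decaying regularization schedule picked for this sketch.
        lam_t = lam0 * math.sqrt((4.0 * math.log(t) + 2.0 * math.log(d)) / t)
        beta_hat = lasso_cd(X_hist, y_hist, lam_t)
    return beta_hat, regret
```

Calling `run_greedy_lasso([1.0, 0.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0])` returns the final estimate and the cumulative regret of one simulated run. The policy never receives the sparsity level as an input, which is the sense in which such algorithms are sparsity-agnostic. Whether the greedily collected samples are diverse enough for the estimate to converge depends on the arm feature distribution, which is the question the paper addresses.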

Paper Structure

This paper contains 46 sections, 27 theorems, 123 equations, and 1 figure.

Key Result

Lemma 1

Under Assumption assumption:x_beta_bound, for any $\delta > 0$, the following inequality holds: with probability at least $1 - e^{- \delta^2 /2}$. Here $\epsilon_s$ is the $\sigma$-sub-Gaussian noise in Eq. eq:reward.

Figures (1)

  • Figure 1: Cumulative regret of the SA LASSO Bandit algorithm [DBLP:conf/icml/OhIZ21] on artificial data that do not satisfy the relaxed symmetry (RS) condition. The blue line is the average over 100 trials, and the orange line is the function $a + b \sqrt{t}$ fitted to the results for rounds $t=5000$ to $10000$. The shaded area shows the region within 0.5 standard deviations ($0.5\sigma$).
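The curve fit mentioned in the Figure 1 caption can be reproduced in spirit with a short sketch. The caption does not say which fitting procedure was used, so ordinary least squares is an assumption here, and `fit_sqrt_curve` is a hypothetical helper name. Because $a + b\sqrt{t}$ is linear in $\sqrt{t}$, the fit reduces to simple linear regression on the transformed variable.

```python
import math


def fit_sqrt_curve(rounds, regrets):
    """Least-squares fit of regret ~ a + b * sqrt(t); returns (a, b)."""
    u = [math.sqrt(t) for t in rounds]  # regress on u = sqrt(t)
    n = len(u)
    u_mean = sum(u) / n
    y_mean = sum(regrets) / n
    s_uu = sum((ui - u_mean) ** 2 for ui in u)
    s_uy = sum((ui - u_mean) * (yi - y_mean) for ui, yi in zip(u, regrets))
    b = s_uy / s_uu           # slope with respect to sqrt(t)
    a = y_mean - b * u_mean   # intercept
    return a, b
```

To mirror the caption, one would pass the rounds $t = 5000, \dots, 10000$ together with the trial-averaged cumulative regret at those rounds. A good fit over that window is consistent with cumulative regret growing on the order of $\sqrt{t}$, although a fit alone does not prove a rate.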

Theorems & Definitions (66)

  • Definition 1: Compatibility Condition
  • Lemma 1: Lemma 4 in DBLP:conf/icml/OhIZ21
  • Lemma 2: Corollary 2 in DBLP:conf/icml/OhIZ21
  • Lemma 3
  • Lemma 4
  • Remark 1
  • Remark 2
  • Definition 2
  • Theorem 1
  • Remark 3
  • ...and 56 more