Sparse Linear Bandits with Blocking Constraints

Adit Jain, Soumyabrata Pal, Sunav Choudhary, Ramasuri Narayanam, Harshita Chopra, Vikram Krishnamurthy

TL;DR

This work tackles high-dimensional sparse linear bandits under a blocking constraint where each arm can be pulled at most once, a setting representative of data-poor, edge, and annotation-efficient tasks. It introduces BSLB, an explore-then-commit algorithm that first selects a well-covered subset of arms so that the resulting Gram matrix is well conditioned, then fits a Lasso estimate and exploits it under the blocking constraint. A corralling-based meta-algorithm, C-BSLB, removes the need to know the sparsity level in advance. Theoretical contributions include offline Lasso guarantees under the restricted eigenvalue (RE) condition with soft sparsity, a subset-selection procedure with provable eigenvalue guarantees, and online regret bounds of the form $\tilde{O}((1+\beta_k)^2 k^{2/3} T^{2/3})$, matching special cases and extending to unknown sparsity via corralling. Empirically, the approach performs well on diverse real-world datasets such as MovieLens, Jester, Goodbooks, PASCAL VOC 2012, and SST-2, which supports its use for personalized recommendation and data-efficient annotation. Overall, the paper provides a coherent framework for blocked high-dimensional bandits with sparse structure, coupling refined statistical guarantees with scalable online algorithms.
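The explore-then-commit structure described above can be illustrated with a minimal sketch. This is not the authors' implementation: uniform random sampling stands in for the paper's subset-selection step, the Lasso is solved with a plain coordinate-descent loop, and the names `soft_threshold`, `lasso_cd`, and `etc_blocked`, along with the regularization value and the toy instance, are all assumptions made for illustration.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by Lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, r, lam, n_iters=200):
    """Minimise (1/2n)||r - X theta||_2^2 + lam * ||theta||_1 by coordinate descent."""
    n, d = len(X), len(X[0])
    theta = [0.0] * d
    resid = list(r)  # residual r - X theta, kept up to date
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iters):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes coordinate j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * theta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - theta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                theta[j] = new
    return theta


def etc_blocked(arms, pull, horizon, n_explore, lam, rng):
    """Explore-then-commit in which every arm index is pulled at most once."""
    if not 0 < n_explore <= horizon <= len(arms):
        raise ValueError("need 0 < n_explore <= horizon <= number of arms")
    available = set(range(len(arms)))
    # ASSUMPTION: uniform sampling replaces the paper's subset-selection procedure.
    explored = rng.sample(sorted(available), n_explore)
    rewards = [pull(i) for i in explored]
    available -= set(explored)  # blocking: explored arms cannot be reused
    theta_hat = lasso_cd([arms[i] for i in explored], rewards, lam)

    def score(i):
        return sum(a * t for a, t in zip(arms[i], theta_hat))

    # Commit phase: pull the best remaining arms under the estimate, once each.
    committed = sorted(available, key=score, reverse=True)[: horizon - n_explore]
    rewards += [pull(i) for i in committed]
    return explored + committed, rewards, theta_hat


# Toy instance (hypothetical sizes, chosen only to keep the run fast).
rng = random.Random(0)
d, n_arms, horizon, n_explore = 20, 200, 60, 30
theta_star = [1.0, -1.0, 0.5] + [0.0] * (d - 3)
arms = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n_arms)]


def pull(i):
    """Noisy linear reward for arm i."""
    return sum(a * t for a, t in zip(arms[i], theta_star)) + rng.gauss(0, 0.1)


pulled, rewards, theta_hat = etc_blocked(arms, pull, horizon, n_explore, lam=0.05, rng=rng)
print(len(set(pulled)) == horizon)  # checks that no arm was pulled twice
```

The sketch only shows how blocking shapes both phases: explored arms are removed from the pool before the commit phase, so the commit phase ranks the remaining arms rather than repeating a single best arm.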

Abstract

We investigate the high-dimensional sparse linear bandits problem in a data-poor regime where the time horizon is much smaller than the ambient dimension and number of arms. We study the setting under the additional blocking constraint where each unique arm can be pulled only once. The blocking constraint is motivated by practical applications in personalized content recommendation and identification of data points to improve annotation efficiency for complex learning tasks. With mild assumptions on the arms, our proposed online algorithm (BSLB) achieves a regret guarantee of $\widetilde{\mathsf{O}}((1+\beta_k)^2 k^{\frac{2}{3}} \mathsf{T}^{\frac{2}{3}})$ where the parameter vector has an (unknown) relative tail $\beta_k$, the ratio of the $\ell_1$ norms of the top-$k$ and remaining entries of the parameter vector. To this end, we show novel offline statistical guarantees of the lasso estimator for the linear model that is robust to the sparsity modeling assumption. Finally, we propose a meta-algorithm (C-BSLB) based on corralling that does not need knowledge of the optimal sparsity parameter $k$ at minimal cost to regret. Our experiments on multiple real-world datasets demonstrate the validity of our algorithms and theoretical framework.
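As a small worked illustration of the relative tail, the sketch below computes $\beta_k$ for a given vector and evaluates the leading term of the regret bound. It assumes $\beta_k$ is the $\ell_1$ mass of the entries outside the top $k$ divided by the $\ell_1$ mass of the top-$k$ entries, so that an exactly $k$-sparse vector has $\beta_k = 0$. The abstract does not spell out the direction of the ratio, so this reading is an assumption. The helper names `relative_tail` and `regret_rate` are hypothetical, and `regret_rate` ignores constants and logarithmic factors.

```python
def relative_tail(theta, k):
    """l1 mass outside the k largest-magnitude entries, relative to the top-k l1 mass.

    ASSUMPTION: tail / top-k is the intended direction of the ratio.
    """
    mags = sorted((abs(v) for v in theta), reverse=True)
    top, tail = sum(mags[:k]), sum(mags[k:])
    if top == 0.0:
        raise ValueError("top-k entries are all zero; beta_k is undefined")
    return tail / top


def regret_rate(beta_k, k, horizon):
    """Leading term (1 + beta_k)^2 * k^(2/3) * T^(2/3), without constants or logs."""
    return (1.0 + beta_k) ** 2 * k ** (2.0 / 3.0) * horizon ** (2.0 / 3.0)


theta = [3.0, 0.0, -1.0, 0.5, 0.5]
beta = relative_tail(theta, k=2)  # top-2 mass is 4.0, tail mass is 1.0
print(beta)  # 0.25
print(regret_rate(beta, k=2, horizon=80))
```

Under this reading, a vector with a heavier tail inflates the bound through the $(1+\beta_k)^2$ factor, while an exactly sparse vector recovers the $k^{2/3}\mathsf{T}^{2/3}$ rate.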

Paper Structure

This paper contains 34 sections, 17 theorems, 79 equations, 13 figures, 3 tables, 4 algorithms.

Key Result

Theorem 1

Let $\mathbf{X} \in \mathbb{R}^{n\times d}$ be the data matrix satisfying $|\mathbf{X}_{ij}| \leq 1$ for all $i,j$. Let $\mathbf{r} \in \mathbb{R}^{n}$ be the corresponding observations such that $\mathbf{r} = \mathbf{X}\boldsymbol{\theta}+\boldsymbol{\eta}$, where $\boldsymbol{\eta}$ ...

Figures (13)

  • Figure 1: Simulation illustrating the performance gap between our proposed algorithm BSLB and naive extensions of ESTC, LinUCB and DR-Lasso that incorporate the blocking constraint. We consider an instance with $\mathsf{M} = 500$ arms ($l=5$ arms of unit $\ell_2$ norm and the remaining arms of $\ell_2$ norm 0.5), $d = 100$, $\mathsf{T}=80$ and $k=5$.
  • Figure 2: Regret of different algorithms in a Simulated Blocked Sparse Linear Bandit Setup.
  • Figure 3: Numerical experiment on MovieLens illustrating the performance gap between our proposed algorithm BSLB and naive extensions of LinUCB and DR-Lasso that incorporate the blocking constraint. The performance of extended ESTC remains competitive.
  • Figure 4: Numerical experiment on the Netflix dataset illustrating the performance gap between our proposed algorithm BSLB and naive extensions of LinUCB and DR-Lasso that incorporate the blocking constraint. The performance of extended ESTC remains competitive.
  • Figure 5: Cumulative regret for recommendation from single ratings only, using BSLB with different exploration periods and when run with CORRAL (Agarwal et al., 2017), on the Books dataset.
  • ...and 8 more figures

Theorems & Definitions (33)

  • Definition 1
  • Theorem 1
  • Remark 1
  • Corollary 1
  • Remark 2
  • Theorem 2
  • Remark 3
  • Theorem 3
  • Remark 4
  • Theorem 4
  • ...and 23 more