Matroid Semi-Bandits in Sublinear Time

Ruo-Chun Tzeng, Naoto Ohsaka, Kaito Ariu

TL;DR

This work introduces FasterCUCB, the first matroid semi-bandit algorithm whose per-round time is sublinear in the number of arms $K$, addressing a key computational bottleneck in large-scale settings. The core idea combines dynamic maintenance of an approximate maximum-weight base under inner-product weights with two techniques: feature rounding to limit the number of distinct weights, and a minimum hitting set over line-arrangement cells to cover multiple queries efficiently. The per-round cost is $O(D\,\mathrm{polylog}(K)\,\mathrm{polylog}(T))$ for uniform, partition, and graphical matroids and $O(D\sqrt{K}\,\mathrm{polylog}(T))$ for transversal matroids, where $D$ is the maximum size of a feasible set and $T$ is the horizon. The regret guarantee asymptotically matches the gap-dependent lower bound of Kveton et al. (2014a). This yields a scalable approach to combinatorial bandits with matroid constraints in large action spaces. The results suggest extending sublinear-time techniques to related bandit settings and exploring alternative weight representations in optimistic learning frameworks.

Abstract

We study the matroid semi-bandits problem, where at each round the learner plays a subset of $K$ arms from a feasible set, and the goal is to maximize the expected cumulative linear rewards. Existing algorithms have per-round time complexity $\Omega(K)$, which becomes expensive when $K$ is large. To address this computational issue, we propose FasterCUCB, whose sampling rule takes time sublinear in $K$ for common classes of matroids: $O(D\,\mathrm{polylog}(K)\,\mathrm{polylog}(T))$ for uniform matroids, partition matroids, and graphical matroids, and $O(D\sqrt{K}\,\mathrm{polylog}(T))$ for transversal matroids. Here, $D$ is the maximum number of elements in any feasible subset of arms, and $T$ is the horizon. Our technique is based on dynamic maintenance of an approximate maximum-weight basis over inner-product weights. Although the introduction of an approximate maximum-weight basis presents a challenge in regret analysis, we can still guarantee an upper bound on regret as tight as that of CUCB, in the sense that it asymptotically matches the gap-dependent lower bound by Kveton et al. (2014a).
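To make the $\Omega(K)$ bottleneck concrete, the following is a minimal sketch of one round of a CUCB-style baseline: compute an optimistic index for every arm, then run the standard greedy algorithm to obtain a maximum-weight base. It is not the paper's FasterCUCB. The function names, the exploration constant `c`, and the toy partition-matroid instance are all assumptions made for illustration.

```python
import math

def ucb_index(mean, pulls, t, c=1.5):
    """Optimistic index: empirical mean plus a confidence radius.
    The constant c is an assumed value, not taken from the paper."""
    return mean + math.sqrt(c * math.log(t) / pulls)

def greedy_max_weight_base(weights, is_independent):
    """Scan arms by decreasing weight; keep an arm if the set stays independent.
    Touches every arm, so one call costs at least linear time in K."""
    base = []
    for k in sorted(range(len(weights)), key=lambda k: weights[k], reverse=True):
        if is_independent(base + [k]):
            base.append(k)
    return base

def partition_oracle(groups, capacity):
    """Independence oracle for a partition matroid:
    at most capacity[g] arms may be chosen from group g."""
    def is_independent(arms):
        counts = {}
        for k in arms:
            g = groups[k]
            counts[g] = counts.get(g, 0) + 1
            if counts[g] > capacity[g]:
                return False
        return True
    return is_independent

# Toy instance (hypothetical): K = 5 arms in two groups, one arm per group, so D = 2.
means = [0.2, 0.8, 0.5, 0.9, 0.4]
pulls = [10, 10, 10, 10, 10]
groups = [0, 0, 0, 1, 1]
weights = [ucb_index(m, n, t=100) for m, n in zip(means, pulls)]
base = greedy_max_weight_base(weights, partition_oracle(groups, {0: 1, 1: 1}))
print(base)
```

Because every arm's index is recomputed and sorted in each round, this baseline cannot run in time sublinear in $K$; that per-round scan is the cost the abstract refers to.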

Paper Structure

This paper contains 40 sections, 13 theorems, 71 equations, 2 figures, 2 tables, 6 algorithms.

Key Result

Theorem 4.4

There exist implementations of Initialize, Find-Base, and Update-Feature such that the following are satisfied: Find-Base always returns a $(1+\epsilon)$-approximate maximum-weight base of a matroid $\mathcal{M}$ with arm $k$'s weight defined as $\langle \boldsymbol{f}_k, \boldsymbol{q} \rangle$ ...
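As a reading aid for the three operations named in the theorem, here is a naive reference sketch. It is exact and takes linear time per query, so it illustrates only the interface and not the sublinear data structure the theorem asserts. The class name, method signatures, and the way a UCB index is written as an inner product (feature $(\hat{\mu}_k, 1/\sqrt{n_k})$, query $(1, \sqrt{c \log t})$) are assumptions, since the theorem statement is truncated in this summary.

```python
import math

class NaiveBaseOracle:
    """Exact, linear-time stand-in for Initialize / Find-Base / Update-Feature.
    All signatures here are hypothetical."""

    def __init__(self, features, is_independent):
        # Initialize: store one 2-D feature per arm and an independence oracle.
        self.features = list(features)
        self.is_independent = is_independent

    def find_base(self, q):
        # Find-Base: greedy over weights <f_k, q>. An exact maximum-weight base
        # is in particular a (1 + eps)-approximate one.
        w = [f[0] * q[0] + f[1] * q[1] for f in self.features]
        base = []
        for k in sorted(range(len(w)), key=lambda k: w[k], reverse=True):
            if self.is_independent(base + [k]):
                base.append(k)
        return base

    def update_feature(self, k, f):
        # Update-Feature: overwrite arm k's feature after new feedback.
        self.features[k] = f

# Assumed decomposition: mean + sqrt(c log t / n) = <(mean, 1/sqrt(n)), (1, sqrt(c log t))>.
means = [0.5, 0.6, 0.4, 0.3]
pulls = [100, 100, 1, 100]
features = [(m, 1.0 / math.sqrt(n)) for m, n in zip(means, pulls)]
q = (1.0, math.sqrt(1.5 * math.log(100)))

# Uniform matroid of rank 2: any set of at most two arms is independent.
oracle = NaiveBaseOracle(features, lambda arms: len(arms) <= 2)
first = oracle.find_base(q)          # the rarely pulled arm 2 has a large bonus
oracle.update_feature(2, (0.0, 0.1)) # pretend arm 2 now looks poor and well explored
second = oracle.find_base(q)
print(first, second)
```

The point of the split is that only the query vector changes with the round index, while an arm's feature changes only when that arm is played, which is what makes dynamic maintenance plausible.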

Figures (2)

  • Figure 1: Illustration of feature rounding. There are $|\mathbb{W}|^2$ bins, and features are assumed not to be in (the interior of) the shaded area. Each feature $\boldsymbol{f}_k$ is rounded to its dominating point $\mathrm{dom}(\boldsymbol{f}_k)$, which is specified by a curved arrow.
  • Figure 2: Illustration of characterization of representable permutations. There are three features $\boldsymbol{f}_1, \boldsymbol{f}_2, \boldsymbol{f}_3$ on $\mathbb{R}^2$. Each dashed line denotes $\overleftrightarrow{\boldsymbol{f}_{i} \boldsymbol{f}_{j}}$ for some $i \neq j$; each black bold line is orthogonal to some dashed line and intersects the origin. Such black bold lines generate six regions, each corresponding to a distinct permutation. For example, for any query $\boldsymbol{q}$ in the hatched area, it holds that $\langle \boldsymbol{f}_1, \boldsymbol{q} \rangle > \langle \boldsymbol{f}_2, \boldsymbol{q} \rangle > \langle \boldsymbol{f}_3, \boldsymbol{q} \rangle$; i.e., $\boldsymbol{q}$ represents a permutation $\pi$ such that $(\pi(1), \pi(2), \pi(3)) = (1,2,3)$.
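The two captions can be illustrated with a small sketch. For Figure 1 it assumes a geometric grid $\mathbb{W} = \{\mathrm{lo}\cdot(1+\epsilon)^i\}$ and rounds each coordinate up to the next grid value; the paper's actual grid is not given in this summary, so `lo`, `eps`, and the function names are hypothetical. For Figure 2 it computes the permutation a query represents by sorting arms by inner product.

```python
import math

def dominating_point(f, eps, lo):
    """Round each coordinate of f up to the nearest value lo * (1 + eps)**i.
    Assumes every coordinate is at least lo > 0 (features avoid the region near 0)."""
    def round_up(x):
        i = max(0, math.ceil(math.log(x / lo, 1.0 + eps)))
        g = lo * (1.0 + eps) ** i
        # Guard against floating-point error leaving g just below x.
        return g if g >= x else g * (1.0 + eps)
    return tuple(round_up(x) for x in f)

def inner(f, q):
    return f[0] * q[0] + f[1] * q[1]

def induced_permutation(features, q):
    """Arm labels (1-based, as in Figure 2) sorted by decreasing <f_k, q>."""
    order = sorted(range(len(features)), key=lambda k: inner(features[k], q), reverse=True)
    return [k + 1 for k in order]

# Rounding on the grid {1, 1.5, 2.25, ...}: (2.0, 1.2) moves up to (2.25, 1.5).
f = (2.0, 1.2)
dom = dominating_point(f, eps=0.5, lo=1.0)

# For a nonnegative query, rounding up inflates the weight by at most a factor 1 + eps.
q = (1.0, 2.0)
ratio = inner(dom, q) / inner(f, q)

# Three features in general position; different queries represent different permutations.
feats = [(3.0, 1.0), (2.0, 2.0), (1.0, 1.5)]
perm_a = induced_permutation(feats, (1.0, 0.0))
perm_b = induced_permutation(feats, (0.0, 1.0))
print(dom, ratio, perm_a, perm_b)
```

Under this assumed grid, rounding collapses the features onto at most $|\mathbb{W}|^2$ distinct points, and the ordering of arms changes only when the query crosses one of the lines through the origin described in Figure 2.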

Theorems & Definitions (20)

  • Remark 4.3
  • Theorem 4.4: $*$
  • Remark 4.5
  • Lemma 4.6: $*$
  • Lemma 4.7: $*$
  • Lemma 4.8: $*$
  • Lemma 4.9: $*$
  • Corollary 4.10: $*$
  • Theorem 5.1
  • Lemma 5.2
  • ...and 10 more