Linear Submodular Maximization with Bandit Feedback

Wenjing Chen, Victoria G. Crawford

TL;DR

This work tackles submodular maximization under bandit feedback when the objective has a linear structure $f(S)=\mathbf{F}(S)^T\mathbf{w}$ with unknown weights $\mathbf{w}$. It introduces two PAC-style algorithms, Linear Greedy (LG) and Linear Threshold Greedy (LinTG), that leverage linear bandit ideas to identify high-gain elements with few noisy queries, achieving guarantees near the classic $1-1/e$ bound for cardinality constraints. Through adaptive allocation and reuse of past samples, the methods attain substantial sample-efficiency improvements over structure-agnostic approaches, as demonstrated in diversified recommender-system experiments on MovieLens data. The results highlight the practical impact of exploiting linear structure in noisy submodular optimization for scalable, high-quality diverse recommendations and related applications.

Abstract

Submodular optimization with bandit feedback has recently been studied in a variety of contexts. In a number of real-world applications such as diversified recommender systems and data summarization, the submodular function exhibits additional linear structure. We consider developing approximation algorithms for the maximization of a submodular objective function $f:2^U\to\mathbb{R}_{\geq 0}$, where $f=\sum_{i=1}^d w_i F_i$. It is assumed that we have value oracle access to the functions $F_i$, but the coefficients $w_i$ are unknown, and $f$ can only be accessed via noisy queries. We develop algorithms for this setting inspired by adaptive allocation algorithms for best-arm identification in linear bandits, with approximation guarantees arbitrarily close to those of the setting where we have value oracle access to $f$. Finally, we empirically demonstrate that our algorithms make vast improvements in terms of sample efficiency compared to algorithms that do not exploit the linear structure of $f$ on instances of movie recommendation.
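
The setting described in the abstract can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the toy ground set, the choice of coverage functions for the $F_i$, the weights `W_TRUE`, the Gaussian noise model, and the helper names. The `greedy` routine is the classic greedy for a cardinality constraint run against an exact value oracle, which is the baseline the abstract compares against; it is not the paper's LG or LinTG algorithm.

```python
import random

# Hypothetical toy instance: d = 2 coverage functions over U = {0, 1, 2, 3}.
# COVER[i][u] is the set of items that element u covers under feature i.
COVER = [
    {0: {"a", "b"}, 1: {"b"}, 2: {"c"}, 3: set()},
    {0: {"x"}, 1: {"x", "y"}, 2: set(), 3: {"z"}},
]
W_TRUE = [0.7, 0.3]  # unknown to the learner in the bandit setting


def feature_vector(S):
    """F(S) = (F_1(S), ..., F_d(S)); each F_i counts items covered by S."""
    return [len(set().union(*(cov[u] for u in S))) for cov in COVER]


def f_exact(S, w=W_TRUE):
    """f(S) = sum_i w_i * F_i(S), available only with an exact value oracle."""
    return sum(wi * Fi for wi, Fi in zip(w, feature_vector(S)))


def noisy_query(S, rng, sigma=0.1):
    """Bandit feedback: f(S) plus zero-mean Gaussian noise (assumed model)."""
    return f_exact(S) + rng.gauss(0.0, sigma)


def greedy(value, ground_set, k):
    """Classic greedy: add the element with the largest marginal gain, k times."""
    S = []
    for _ in range(k):
        candidates = [u for u in ground_set if u not in S]
        best = max(candidates, key=lambda u: value(S + [u]) - value(S))
        S.append(best)
    return S


if __name__ == "__main__":
    rng = random.Random(0)
    S = greedy(f_exact, range(4), k=2)
    print("greedy set:", S, "F(S):", feature_vector(S))
    print("one noisy evaluation of f(S):", noisy_query(S, rng))
```

Because the learner can evaluate `feature_vector` exactly but sees `f` only through `noisy_query`, each noisy sample carries information about the shared weight vector rather than about a single set, which is the structure the abstract says the algorithms exploit.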

Paper Structure

This paper contains 24 sections, 20 theorems, 96 equations, 2 figures, 3 algorithms.

Key Result

Proposition 1

Let $\hat{\textbf{w}}_t^{\lambda}$ be the solution to the regularized least-squares problem with regularizer $\lambda$ and let $\textbf{A}_t^{\lambda} = \textbf{X}_t^T\textbf{X}_t+\lambda \textbf{I}$. Then for any $N\geq 0$ and every adaptive sequence $\textbf{X}_t$ such that at any step t, $\textbf
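
As a rough illustration of the quantities named in Proposition 1, the sketch below computes a regularized least-squares estimate together with the matrix $\textbf{A}_t^{\lambda} = \textbf{X}_t^T\textbf{X}_t+\lambda \textbf{I}$. It assumes the standard ridge objective $\sum_s (\textbf{x}_s^T\textbf{w}-y_s)^2+\lambda\|\textbf{w}\|^2$, whose minimizer is $(\textbf{A}_t^{\lambda})^{-1}\textbf{X}_t^T\textbf{y}$; the function names `ridge_estimate` and `solve` and the toy data are hypothetical, and the sketch does not reproduce the proposition's confidence bound.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for c in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def ridge_estimate(X, y, lam):
    """Return (w_hat, A) with A = X^T X + lam * I and w_hat = A^{-1} X^T y."""
    d = len(X[0])
    A = [
        [sum(x[i] * x[j] for x in X) + (lam if i == j else 0.0) for j in range(d)]
        for i in range(d)
    ]
    b = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(d)]
    return solve(A, b), A


if __name__ == "__main__":
    # Rows of X play the role of queried feature vectors; y holds the observations.
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    y = [2.0, 3.0, 5.0]  # noiseless responses for w = (2, 3)
    w_hat, A = ridge_estimate(X, y, lam=1.0)
    print("w_hat:", w_hat)
    print("A:", A)
```

With $\lambda = 0$ and noiseless data the estimate recovers the generating weights exactly; a positive $\lambda$ shrinks the estimate toward zero but keeps $\textbf{A}_t^{\lambda}$ invertible even when few directions have been sampled.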

Figures (2)

  • Figure 1: The experimental results of running the algorithms on instances of movie recommendation on subsets of the MovieLens 25M dataset with $n=60$, $d=5$ ("movie60") and $n=5000$, $d=30$ ("movie5000"), and on different datasets with different values of $d$.
  • Figure 2: The experimental results of running the algorithms on instances of movie recommendation on subsets of the MovieLens 25M dataset with $n=60$, $d=5$ ("movie60") and $n=5000$, $d=30$ ("movie5000"), and on different datasets with different values of $d$.

Theorems & Definitions (33)

  • Definition 1: Linear Submodular Maximization with a Cardinality Constraint (SM)
  • Proposition 1
  • Theorem 2
  • Theorem 3
  • Lemma 1
  • proof
  • Theorem 4
  • Lemma 2
  • proof
  • Lemma 3
  • ...and 23 more