A Unified Regularization Approach to High-Dimensional Generalized Tensor Bandits

Jiannan Li; Yiyang Yang; Yao Wang; Shaojie Tang

TL;DR

This work tackles high-dimensional, high-order tensor bandits under generalized linear rewards by introducing G-ELTC, a unified framework that regularizes parameter estimation with weakly decomposable norms tailored to tensor structures. Leveraging Tucker decomposition, the authors derive structure-aware regret bounds that improve over existing tensor methods, covering tensor low-rankness, slice sparsity, and extensions to other structures. A parameter-estimation mechanism under GLMs connects estimation error to regret, with a novel analysis approach based on generic chaining. The framework is validated through experiments demonstrating sublinear regret and favorable comparisons to baselines across multiple structured settings, highlighting its practicality for complex, high-dimensional decision problems.

Abstract

Modern decision-making scenarios often involve data that is both high-dimensional and rich in higher-order contextual information, where existing bandit algorithms fail to generate effective policies. In response, we propose a generalized linear tensor bandit algorithm that tackles these challenges by incorporating low-dimensional tensor structures, and we derive a unified analytical framework for it. Specifically, our framework introduces a convex optimization approach with weakly decomposable regularizers, which not only achieves better results under the tensor low-rankness assumption but also extends to other low-dimensional structures such as slice sparsity and low-rankness. The theoretical analysis shows that, compared to existing results for low-rank tensors, our framework provides better bounds and applies more broadly. Notably, in the special case where the tensor degenerates to a low-rank matrix, our bounds still offer advantages in certain scenarios.
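To make the setting concrete, here is a minimal NumPy sketch of one round of a generalized linear tensor bandit. It assumes a logistic link, Bernoulli rewards, and an order-3 parameter of Tucker rank (2, 2, 2). These choices, and the names `theta`, `arms`, and `play_round`, are illustrative assumptions rather than details from the paper, and the paper's estimator is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
dims, ranks = (6, 5, 4), (2, 2, 2)

# Hypothetical low Tucker-rank parameter: a small core tensor multiplied
# by one orthonormal factor matrix along each mode.
core = rng.standard_normal(ranks)
U1, U2, U3 = (np.linalg.qr(rng.standard_normal((d, r)))[0]
              for d, r in zip(dims, ranks))
theta = np.einsum("abc,ia,jb,kc->ijk", core, U1, U2, U3)
theta /= np.linalg.norm(theta)  # normalize to unit Frobenius norm


def sigmoid(z):
    """Logistic link (an assumed choice of generalized linear model)."""
    return 1.0 / (1.0 + np.exp(-z))


def play_round(theta, arms, chosen, rng):
    """Return (reward, instantaneous regret) for pulling arm `chosen`.

    `arms` stacks the candidate context tensors along axis 0.
    """
    # Expected reward of each arm: link applied to the tensor inner product.
    means = sigmoid(np.tensordot(arms, theta, axes=theta.ndim))
    reward = int(rng.binomial(1, means[chosen]))
    regret = float(means.max() - means[chosen])
    return reward, regret


arms = rng.standard_normal((10,) + dims)  # 10 candidate context tensors
reward, regret = play_round(theta, arms, chosen=0, rng=rng)
```

Summing `regret` over rounds gives the cumulative regret that the paper's bounds control. The low-rank structure shows up in the fact that every mode unfolding of `theta` has rank at most 2.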

Paper Structure (33 sections, 15 theorems, 65 equations, 5 figures, 1 algorithm)

Introduction
Preliminaries and Notations
Problem Setting
Main Results
The Unified Algorithm
Regret Analysis
Tensor-Wise Low-Rankness
Slice-Wise Sparsity and Low-Rankness
Extension to Other Low-Dimensional Structures
Experiments
Conclusion
Related Works
Auxiliary Lemmas
Proof of Theorem 1
Proof of Theorem 2
...and 18 more sections

Key Result

Theorem 1

If $\lambda_T \geq \alpha R\left(\frac{1}{T} \sum_{t \in [T]} \epsilon_t \mathcal{X}_t\right)$ and Assumptions 1-3 hold, then for any $T \geq c \phi w^2(\boldsymbol{\Theta})$, with probability at least $1 -\delta$, ..., where $\alpha=\frac{c_R+3}{2c_R}$, $\boldsymbol{\Theta}=\{\Theta \mid R(\Theta)\leq 1\}$, and $\|\cdot\|_T^2=\frac{1}{T}\sum_{i \in [T]} \langle \cdot, \mathcal{X}_i \rangle^2$ ...
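Two quantities in this statement can be computed numerically, as the hedged sketch below illustrates. It takes the empirical norm to be an average of squared inner products, and it estimates the Gaussian width $w(\boldsymbol{\Theta})$ by Monte Carlo. For the width it makes a simplifying assumption that is not taken from the paper: $R$ is the matrix nuclear norm, so the supremum of $\langle G, \Theta \rangle$ over the unit ball equals the spectral norm of $G$. The function names are hypothetical.

```python
import numpy as np


def empirical_norm_sq(delta, contexts):
    """(1/T) * sum_i <delta, X_i>^2, with contexts stacked along axis 0."""
    inner = np.tensordot(contexts, delta, axes=delta.ndim)
    return float(np.mean(inner ** 2))


def gaussian_width_nuclear_ball(d1, d2, n_draws=200, seed=0):
    """Monte Carlo estimate of E sup_{||Theta||_* <= 1} <G, Theta>.

    Assumes R is the matrix nuclear norm; its dual is the spectral norm,
    so the supremum for each Gaussian draw G is the largest singular value.
    """
    rng = np.random.default_rng(seed)
    draws = rng.standard_normal((n_draws, d1, d2))
    spectral = np.linalg.norm(draws, ord=2, axis=(1, 2))
    return float(np.mean(spectral))


# Tiny check of the empirical norm: <ones, I> = 2 and <ones, 0> = 0.
delta = np.ones((2, 2))
contexts = np.stack([np.eye(2), np.zeros((2, 2))])
norm_sq = empirical_norm_sq(delta, contexts)

width = gaussian_width_nuclear_ball(20, 20)
```

Under this matrix assumption the estimated width should stay below $\sqrt{d_1}+\sqrt{d_2}$, a standard bound on the expected spectral norm of a Gaussian matrix. This hints at why a sample-size condition scaling with $w^2(\boldsymbol{\Theta})$ can be far milder than one scaling with the ambient dimension $d_1 d_2$.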

Figures (5)

Figure 1: Experimental results of tensor bandits under low multi-linear rankness with different dimensional settings. (a) displays the curve of cumulative regret over time, while (b) shows the variation of the ratio of cumulative regret to the theoretical bound $B_T$ over time.
Figure 2: Experimental results of tensor bandits under slice sparse and low rank structure with different dimensional settings. (a) displays the curve of cumulative regret over time, while (b) shows the variation of the ratio of cumulative regret to the theoretical bound $B_T$ over time.
Figure 3: Experimental results of tensor bandits under low multi-linear rankness with different rank settings. (a) displays the curve of cumulative regret over time, while (b) shows the variation of the ratio of cumulative regret to the theoretical bound $B_T$ over time.
Figure 4: Experimental results of tensor bandits under slice sparse and low rank structure with different rank settings. (a) displays the curve of cumulative regret over time, while (b) shows the variation of the ratio of cumulative regret to the theoretical bound $B_T$ over time.
Figure 5: Comparison of the cumulative regret bounds between the proposed algorithm and DR Lasso under the degenerate Lasso Bandit setting for different $\rho$, $d$, and $s$ configurations.

Theorems & Definitions (36)

Definition 1: Tensor inner product
Definition 2: Tensor Frobenius norm
Definition 3: Tensor mode product
Definition 4: Tucker decomposition
Definition 5: Weak decomposability of norm
Definition 6: Compatibility constant
Definition 7: Gaussian width
Remark 1
Remark 2
Theorem 1: Error for parameter estimation
...and 26 more
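The first four definitions are standard tensor operations, sketched below in NumPy under the usual conventions: the inner product is the sum of elementwise products, the Frobenius norm is the square root of the self inner product, and the mode product contracts a matrix with one axis of the tensor. The function names are illustrative, and the paper's exact notation may differ.

```python
import numpy as np


def inner(a, b):
    """Tensor inner product: sum of elementwise products."""
    return float(np.sum(a * b))


def frobenius(a):
    """Tensor Frobenius norm induced by the inner product."""
    return float(np.sqrt(inner(a, a)))


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` (new_dim x old_dim)."""
    moved = np.tensordot(matrix, tensor, axes=(1, mode))
    # tensordot puts the new axis first; move it back to position `mode`.
    return np.moveaxis(moved, 0, mode)


def tucker(core, factors):
    """Tucker reconstruction: apply one factor matrix per mode of the core."""
    out = core
    for mode, factor in enumerate(factors):
        out = mode_product(out, factor, mode)
    return out


rng = np.random.default_rng(1)
core = rng.standard_normal((2, 3, 2))
factors = [rng.standard_normal((d, r)) for d, r in zip((4, 5, 6), core.shape)]
full = tucker(core, factors)

# Cross-check against an explicit index formula for the order-3 case.
reference = np.einsum("abc,ia,jb,kc->ijk", core, *factors)
```

Here `full` has shape (4, 5, 6) but is determined by a 2 x 3 x 2 core and three thin factor matrices. This is the kind of low-dimensional structure that the paper's regularizers are designed to exploit.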
