Online Tensor Learning: Computational and Statistical Trade-offs, Adaptivity and Optimal Regret

Jingyang Li; Jian-Feng Cai; Yang Chen; Dong Xia

TL;DR

A unified online Riemannian gradient descent (oRGrad) algorithm for tensor learning, which is computationally efficient, consumes much less memory, and can handle sequentially arriving data while making timely predictions.

Abstract

Large tensor learning algorithms are typically computationally expensive and require storing a vast amount of data. In this paper, we propose a unified online Riemannian gradient descent (oRGrad) algorithm for tensor learning, which is computationally efficient, consumes much less memory, and can handle sequentially arriving data while making timely predictions. The algorithm is applicable to both linear and generalized linear models. If the time horizon T is known, oRGrad achieves statistical optimality by choosing an appropriate fixed step size. We find that noisy tensor completion particularly benefits from online algorithms by avoiding the trimming procedure and ensuring sharp entry-wise statistical error, which is often technically challenging for offline methods. The regret of oRGrad is analyzed, revealing a fascinating trilemma concerning the computational convergence rate, statistical error, and regret bound. By selecting an appropriate constant step size, oRGrad achieves an $O(T^{1/2})$ regret. We then introduce the adaptive-oRGrad algorithm, which can achieve the optimal $O(\log T)$ regret by adaptively selecting step sizes, regardless of whether the time horizon is known. The adaptive-oRGrad algorithm can attain a statistically optimal error rate without knowing the horizon. Comprehensive numerical simulations corroborate our theoretical findings. We show that oRGrad significantly outperforms its offline counterpart in predicting the solar F10.7 index with tensor predictors that monitor space weather impacts.
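As a rough illustration of the online update the abstract describes, the following is a minimal sketch under simplifying assumptions, not the paper's oRGrad algorithm. It treats only the matrix (order-2) case with a squared loss: each arriving pair (X, y) is used once and then discarded, the gradient is projected onto the tangent space of the fixed-rank manifold, and a truncated SVD serves as the retraction. The function names, the Gaussian design, the step size 0.01, and all problem sizes are illustrative choices; a tensor version would need a multilinear-rank analogue of both steps, which is not attempted here.

```python
import numpy as np

def tangent_project(G, U, V):
    """Project G onto the tangent space of the rank-r matrix manifold at U diag(s) V^T."""
    UtG = U.T @ G
    GV = G @ V
    return U @ UtG + GV @ V.T - U @ (UtG @ V) @ V.T

def retract(M, r):
    """Map M back to a rank-r matrix via truncated SVD; returns the factors (U, s, V)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :r], s[:r], Vt[:r].T

def online_rgrad(stream, U, s, V, eta):
    """One pass over (X, y) pairs with a constant step size eta (hypothetical helper)."""
    r = len(s)
    for X, y in stream:
        T = (U * s) @ V.T                    # current rank-r estimate
        resid = np.sum(X * T) - y            # prediction error on the new observation
        G = resid * X                        # gradient of 0.5 * (<X, T> - y)^2
        U, s, V = retract(T - eta * tangent_project(G, U, V), r)
    return (U * s) @ V.T

def demo(d=10, r=2, horizon=3000, eta=0.01, sigma=0.1, seed=0):
    """Simulated online linear regression; returns (initial error, final error)."""
    rng = np.random.default_rng(seed)
    # Ground truth: a random rank-r matrix with singular values 5 and 4 (assumed setup).
    U0, _ = np.linalg.qr(rng.standard_normal((d, r)))
    V0, _ = np.linalg.qr(rng.standard_normal((d, r)))
    T_star = (U0 * np.array([5.0, 4.0])[:r]) @ V0.T
    # Warm start: a perturbed copy of the truth, retracted to rank r.
    U, s, V = retract(T_star + 0.1 * rng.standard_normal((d, d)), r)
    init_err = np.linalg.norm((U * s) @ V.T - T_star)

    def stream():
        for _ in range(horizon):
            X = rng.standard_normal((d, d))
            yield X, np.sum(X * T_star) + sigma * rng.standard_normal()

    T_hat = online_rgrad(stream(), U, s, V, eta)
    return init_err, np.linalg.norm(T_hat - T_star)
```

Calling `demo()` returns the Frobenius error before and after the pass. Only the rank-r factors are kept between steps, which is the memory saving the abstract alludes to; a larger constant step size would contract faster but settle at a higher noise floor, a toy version of the trade-off the abstract mentions.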

Paper Structure (40 sections, 21 theorems, 433 equations, 6 figures, 2 tables, 6 algorithms)

Introduction
Methodology
Background and notations
Generalized low-rank tensor learning
Online Riemannian gradient descent
Online Generalized Tensor Regression
Linear regression
Online initialization
Poisson regression
Online initialization
Online Noisy Tensor Completion
Initialization
Online Binary Tensor Learning
Initialization
Sub-Optimal Regret of oRGrad using Constant Step Size
...and 25 more sections

Key Result

Theorem 1

Suppose Assumptions assump:X-design-assump:GLM-trueT hold, the initialization ${\boldsymbol{\mathcal{T}}}_0\in{\mathbb M}_{{\boldsymbol r}}$ satisfies $\| {\boldsymbol{\mathcal{T}}}_0 - {\boldsymbol{\mathcal{T}}}^* \|_{\rm{F}}\leq c_m\mu_{\alpha}^{-1}\gamma_{\alpha}\lambda_{{\textsf{\tiny min}}}$ for [...], and the signal strength satisfies [...], where $C_0,\ldots,C_4>0$ are absolute constants, and $c_m, C_m>$ [...]

Figures (6)

Figure 1: Scree plot for rank selection
Figure 2: Convergence dynamics of oRGrad for online tensor linear regression and completion.
Figure 3: Average per-step runtime versus the cube of the dimension
Figure 4: Regret performance. Left: a constant step size; Right: adaptive step sizes.
Figure 5: Example of one slice in the tensor covariate and the response
...and 1 more figures

Theorems & Definitions (38)

Example 1: linear regression
Example 2: logistic regression
Example 3: Poisson regression
Example 4: noisy tensor completion
Example 5: binary tensor learning
Theorem 1
Theorem 2
Theorem 3
Theorem 4
Theorem 5
...and 28 more
