Learnable Scaled Gradient Descent for Guaranteed Robust Tensor PCA

Lanlan Feng; Ce Zhu; Yipeng Liu; Saiprasad Ravishankar; Longxiu Huang

TL;DR

This work introduces RTPCA-SGD, a scalable robust tensor PCA method built on the tensor SVD (t-SVD). It factors the low-rank component as X = L * R^T and alternates scaled gradient updates of the two factors with a thresholding update of the sparse component S, which avoids a costly full t-SVD in every iteration. The authors prove exact recovery and linear convergence at a rate independent of the condition number κ under mild μ-incoherence and α-sparsity assumptions, using a decaying sequence of thresholding parameters and a fixed step size. A learnable, self-supervised deep unfolding model (RTPCA-LSGD) is proposed to learn the four critical parameters (ζ_0, ζ_1, τ, η) without ground-truth data. Experiments on synthetic and real data (video denoising and background initialization) show that RTPCA-SGD outperforms TNN-based RTPCA at competitive runtimes, and that RTPCA-LSGD yields further gains, supporting both the theory and the practical utility of the approach.
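The alternation described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes soft-thresholding for the sparse update, the standard scaled-gradient form for the factor updates, and a spectral initialization from a truncated t-SVD. The function names (`fft3`, `ifft3`, `ct`, `soft_threshold`, `rtpca_sgd_sketch`) are hypothetical, and the t-product is evaluated as slice-wise matrix products in the Fourier domain along the third mode.

```python
import numpy as np

def fft3(t):
    # FFT along the third mode; slices are moved to axis 0 so that
    # NumPy's batched matmul/svd/inv act on each Fourier-domain slice.
    return np.moveaxis(np.fft.fft(t, axis=2), 2, 0)

def ifft3(t_hat):
    # Inverse of fft3; the result is real for conjugate-symmetric input.
    return np.real(np.fft.ifft(np.moveaxis(t_hat, 0, 2), axis=2))

def ct(a_hat):
    # Slice-wise conjugate transpose (tensor transpose in the Fourier domain).
    return np.conj(np.swapaxes(a_hat, 1, 2))

def soft_threshold(t, zeta):
    # Entry-wise soft-thresholding (an assumed choice of thresholding operator).
    return np.sign(t) * np.maximum(np.abs(t) - zeta, 0.0)

def rtpca_sgd_sketch(Y, R, zeta0, zeta1, tau, eta, n_iter=100):
    """Hypothetical sketch: Y ~ L * R^T + S with tubal rank R."""
    # Spectral initialization: threshold once, then truncate the t-SVD.
    S = soft_threshold(Y, zeta0)
    U, s, Vh = np.linalg.svd(fft3(Y - S), full_matrices=False)
    sq = np.sqrt(s[:, :R])
    Lf = U[:, :, :R] * sq[:, None, :]        # left factor, Fourier domain
    Rf = ct(Vh[:, :R, :]) * sq[:, None, :]   # right factor, Fourier domain
    for k in range(n_iter):
        zeta = zeta1 * tau**k                # decaying threshold
        X = ifft3(Lf @ ct(Rf))
        S = soft_threshold(Y - X, zeta)      # sparse update
        G = fft3(X + S - Y)                  # residual of the fit
        # Scaled gradient steps: the (R^T R)^{-1} and (L^T L)^{-1}
        # preconditioners are inverted slice by slice.
        L_new = Lf - eta * G @ Rf @ np.linalg.inv(ct(Rf) @ Rf)
        R_new = Rf - eta * ct(G) @ Lf @ np.linalg.inv(ct(Lf) @ Lf)
        Lf, Rf = L_new, R_new
    return ifft3(Lf @ ct(Rf)), S
```

Each iteration only needs products with the thin factors and inverses of R-by-R slices, which is the source of the savings over a full t-SVD per iteration. In practice the thresholds and step size would be tuned or learned rather than set by hand.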

Abstract

Robust tensor principal component analysis (RTPCA) aims to separate the low-rank and sparse components of multi-dimensional data, making it an essential technique in signal processing and computer vision. The recently emerging tensor singular value decomposition (t-SVD) has gained considerable attention because it captures the low-rank structure of tensors better than the traditional matrix SVD. However, existing methods often rely on the computationally expensive tensor nuclear norm (TNN), which limits their scalability for real-world tensors. To address this issue, we explore an efficient scaled gradient descent (SGD) approach within the t-SVD framework for the first time, and propose the RTPCA-SGD method. Theoretically, we rigorously establish the recovery guarantees of RTPCA-SGD under mild assumptions, demonstrating that with appropriate parameter selection, it achieves linear convergence to the true low-rank tensor at a constant rate, independent of the condition number. To enhance its practical applicability, we further propose a learnable self-supervised deep unfolding model, which enables effective parameter learning. Numerical experiments on both synthetic and real-world datasets demonstrate the superior performance of the proposed methods while maintaining competitive computational efficiency; in particular, they require less time than RTPCA-TNN.
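To make the t-SVD vocabulary in the abstract concrete, the following sketch computes a t-product, the tubal rank, and a tensor nuclear norm with NumPy. It is illustrative only: the function names are hypothetical, the tubal rank is taken as the largest matrix rank among the Fourier-domain frontal slices, and the TNN uses one common normalization (the average of the slice-wise nuclear norms), which may differ from the convention in the paper.

```python
import numpy as np

def t_product(A, B):
    """t-product of A (n1 x r x n3) and B (r x n2 x n3), computed as
    slice-wise matrix products after an FFT along the third mode."""
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    C_hat = np.einsum("irk,rjk->ijk", A_hat, B_hat)
    return np.real(np.fft.ifft(C_hat, axis=2))

def fourier_singular_values(T):
    # Singular values of every Fourier-domain frontal slice, shape (n3, min(n1, n2)).
    T_hat = np.moveaxis(np.fft.fft(T, axis=2), 2, 0)
    return np.linalg.svd(T_hat, compute_uv=False)

def tubal_rank(T, tol=1e-8):
    # Largest numerical rank among the Fourier-domain slices.
    s = fourier_singular_values(T)
    return int(np.max(np.sum(s > tol * s.max(), axis=1)))

def tensor_nuclear_norm(T):
    # Average of slice-wise nuclear norms (assumed normalization).
    # Note that this requires a full SVD of every slice.
    return float(fourier_singular_values(T).sum() / T.shape[2])
```

Evaluating or minimizing the TNN requires a full SVD of every Fourier-domain slice, which is the cost that a factorized formulation is designed to avoid.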

Paper Structure (33 sections, 16 theorems, 107 equations, 10 figures, 2 tables, 1 algorithm)

Introduction
Related works
Notation and Preliminaries
Methods
Model
Algorithm
Spectral initialization
Scaled gradient updates
Parameter learning
Computational complexity
Theoretical Results
Experiments
Synthetic data
Phase transition
Linear convergence rate
...and 18 more sections

Key Result

Theorem 1

Suppose that $\mathcal{X}_{\star}$ with tubal-rank $R$ satisfies the tensor $\mu$-incoherence conditions (i.e., the incoherence assumption), and that $\mathcal{S}_{\star}$ is an $\alpha$-sparse tensor (i.e., the sparsity assumption) with $\alpha \leq \frac{1}{10^4 \mu R^{1.5} I_3^{1.5} \kappa}$. If we set the th and

Figures (10)

Figure 1: Illustration of RTPCA model.
Figure 2: Illustration of t-SVD framework.
Figure 3: Network architecture. The observed tensor $\mathcal{Y}$ and the tubal rank $R$ serve as inputs to the first layer, which performs the spectral initialization step described in Algorithm 1. The subsequent recurrent layers implement the iterative updates for $\mathcal{X}_{k} = \mathcal{L}_{k} * \mathcal{R}_{k}^{\top}$ and $\mathcal{S}_{k}$, following Algorithm 1. These layers require the iteration number $k$, as it determines the $(k + 1)$-th thresholding parameter, defined as $\zeta_{k+1} = \tau^{k} \zeta_{1}$ for $k \geq 0$.
Figure 4: Comparison of tensor sparsity and matrix sparsity. Blue boxes denote outlier entries, while white boxes indicate zero entries. The matrix on the right is obtained by unfolding the tensor on the left along the 1st mode.
Figure 5: Phase transition performance of RTPCA-SGD under different condition numbers. Here $I_1 = I_2 = 100, I_3 = 50$.
...and 5 more figures
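
The thresholding schedule in the Figure 3 caption, $\zeta_{k+1} = \tau^{k} \zeta_{1}$ for $k \geq 0$, can be written out as a small worked example. The helper name `threshold_schedule` is hypothetical; the numeric values are arbitrary and chosen only to show the indexing (list position $k$ holds $\zeta_{k+1}$, while $\zeta_0$ is a separate parameter used at initialization).

```python
def threshold_schedule(zeta1, tau, num_layers):
    """Return [zeta_1, zeta_2, ..., zeta_{num_layers}] with zeta_{k+1} = tau**k * zeta1."""
    return [zeta1 * tau**k for k in range(num_layers)]

# Illustrative values only: zeta_1 = 1.0 and tau = 0.5 over four layers.
schedule = threshold_schedule(1.0, 0.5, 4)
```

With $0 < \tau < 1$ the thresholds shrink geometrically, so only two scalars ($\zeta_1$ and $\tau$) determine the threshold used by every recurrent layer.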

Theorems & Definitions (56)

Definition 1
Definition 2: Tensor Frobenius norm
Definition 3: Tensor $\ell_{2,\infty}$ norm
Definition 4: Tensor $\ell_{1,\infty}$ norm
Definition 5: Tensor infinity norm
Definition 6: Conjugate transpose
Definition 7: Identity tensor
Definition 8: Inverse tensor
Definition 9: Orthogonal tensor
Definition 10: F-diagonal tensor
...and 46 more
