Efficient Differentiable Approximation of Generalized Low-rank Regularization

Naiqi Li, Yuqiu Xie, Peiyuan Liu, Tao Dai, Yong Jiang, Shu-Tao Xia

TL;DR

This work tackles the optimization bottleneck of low-rank regularization by introducing a differentiable, SVD-free approximation that covers a broad class of relaxations, including the nuclear norm and Schatten-$p$ norms. It defines a stochastic rank surrogate via random projections and leverages Taylor and Laguerre expansions to create differentiable estimators for generalized LRR, enabling gradient-based optimization on standard hardware. Convergence analysis shows that both bias and variance of the rank estimator decline as the sample size $N$ and iteration counts $k_1,k_2$ grow, supporting reliable optimization. Empirically, the method demonstrates versatility across matrix completion, video fore-background separation, and denoising with DNNs, delivering improved performance and efficient GPU-friendly computation. The accompanying code is available at the project GitHub repository.

Abstract

Low-rank regularization (LRR) has been widely applied in various machine learning tasks, but the associated optimization is challenging. Directly optimizing the rank function under constraints is NP-hard in general. To overcome this difficulty, various relaxations of the rank function have been studied. However, optimizing these relaxed LRRs typically depends on singular value decomposition (SVD), a time-consuming and nondifferentiable operator that is incompatible with gradient-based techniques. To address these challenges, we propose an efficient differentiable approximation of the generalized LRR. The considered LRR form subsumes many popular choices, such as the nuclear norm, the Schatten-$p$ norm, and various nonconvex relaxations. Our method allows LRR terms to be appended to loss functions in a plug-and-play fashion, and its GPU-friendly operations make implementation efficient and convenient. Furthermore, we present a convergence analysis, which rigorously shows that both the bias and the variance of our rank estimator decrease rapidly with increased sample size and iteration steps. In the experimental study, the proposed method is applied to various tasks, demonstrating its versatility and efficiency. Code is available at https://github.com/naiqili/EDLRR.

Paper Structure

This paper contains 17 sections, 23 theorems, 49 equations, 9 figures, 2 tables, 6 algorithms.

Key Result

Proposition 1

For a given matrix $\mathbf{S}\in \mathbb R^{m \times n}$, define the recursive sequence $\mathbf{S}_{i+1}=2 \mathbf{S}_i-\mathbf{S}_i \mathbf{S} \mathbf{S}_i$, with $\mathbf{S}_0=\alpha\mathbf{S}^\top$. Then $\lim_{i \to \infty} \mathbf{S}_i=\mathbf{S}^\dag$, provided $0<\alpha<2/\sigma_1^2(\mathbf{S})$.
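The recursion in Proposition 1 uses only matrix products, which is what makes it SVD-free. The following is a minimal NumPy sketch of that recursion, not the authors' implementation. The function name `iterative_pinv`, the fixed iteration count `num_iters`, and the default step size $\alpha = 1/\|\mathbf{S}\|_F^2$ are assumptions made for illustration. The default is a conservative choice: the squared Frobenius norm is never smaller than the squared largest singular value, so this $\alpha$ stays below $2/\sigma_1^2(\mathbf{S})$ for any nonzero matrix.

```python
import numpy as np


def iterative_pinv(S, num_iters=50, alpha=None):
    """Approximate the pseudo-inverse of S using only matrix products.

    Implements S_{i+1} = 2 S_i - S_i S S_i with S_0 = alpha * S^T.
    `num_iters` and the default `alpha` are illustrative assumptions.
    """
    S = np.asarray(S, dtype=np.float64)
    if alpha is None:
        # ||S||_F^2 >= sigma_1(S)^2, so this step size is a safe
        # (if conservative) choice that needs no singular values.
        alpha = 1.0 / np.linalg.norm(S, "fro") ** 2
    X = alpha * S.T  # S_0, shape (n, m)
    for _ in range(num_iters):
        X = 2.0 * X - X @ S @ X  # S_{i+1} = 2 S_i - S_i S S_i
    return X
```

For a well-conditioned, full-column-rank input, the result can be compared against `np.linalg.pinv(S)`. A smaller `alpha` keeps the iteration safe but slows the early iterations, so in practice the iteration count would be tuned to the conditioning of the matrix.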

Figures (9)

  • Figure 1: Comparison of matrix completion for text removal. (a) ground truth; (b) image with text; (c)-(j) recovered images.
  • Figure 2: The results using our algorithm in fore-background separation. From top to bottom: original frames of the video, separated backgrounds, and foreground objects.
  • Figure 3: Results of applying low-rank regularization and the proposed differentiable approximation technique in the DnCNN denoising model, measured in PSNR.
  • Figure 4: Approximating the Laplace relaxation function with Laguerre's expansion, with degree 8-10 polynomials. 1) Visualization of the standard degree 0-5 Laguerre polynomials; 2) Visualization of the approximation quality with degree 8-10 polynomials.
  • Figure 5: Numerical analysis on the synthetic dataset.
  • ...and 4 more figures

Theorems & Definitions (33)

  • Proposition 1: Iterative matrix pseudo-inverse [ben1966iterative]
  • Proposition 2: NS iteration for matrix square root
  • Proposition 3: Equivalent definition of matrix rank [wright2022high]
  • Proposition 4
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Proposition 3: Equivalent definition of matrix rank [wright2022high]
  • proof
  • ...and 23 more