Relative Error Tensor Low Rank Approximation

Zhao Song, David P. Woodruff, Peilin Zhong

TL;DR

The paper tackles relative-error low-rank tensor approximation under the Frobenius norm, addressing two fundamental obstacles: an optimal rank-k solution may not exist, and computing tensor rank is NP-hard. It introduces bicriteria and fixed-parameter approaches that yield near-optimal relative-error guarantees, using an iterative framework based on flattenings, regression subproblems, and randomized sketches (Gaussian, CountSketch, TensorSketch) to efficiently approximate CP decompositions and CURT-type decompositions. The results extend to a broad family of tensor error measures (ℓ1, ℓp, weighted norms) and to input-sparsity-time matrix CUR decompositions, while also establishing ETH-based hardness and hard instances that justify the need for bicriteria or parameterized strategies. The work thus provides the first scalable, relative-error low-rank approximation tools for tensors across many norms and structural settings, with significant implications for CURT, tensor regression, and large-scale data applications.
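As a rough illustration of why a sketch such as CountSketch fits an input-sparsity time budget, the snippet below applies a CountSketch matrix $S$ to a sparse matrix in a single pass over its non-zero entries. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function name `countsketch_apply`, the dictionary-of-entries input format, and the toy data are all hypothetical choices made for this example.

```python
import random

def countsketch_apply(entries, n, d, m, seed=0):
    """Compute S @ A for an m x n CountSketch matrix S and a sparse n x d matrix A.

    `entries` maps (row, col) -> value for the non-zeros of A (assumed format).
    Each row i of A is hashed to one bucket with a random sign, so the work is
    proportional to the number of non-zeros plus the setup cost for n rows.
    """
    rng = random.Random(seed)
    bucket = [rng.randrange(m) for _ in range(n)]        # h(i): target row of S @ A
    sign = [rng.choice((-1.0, 1.0)) for _ in range(n)]   # s(i): random sign
    SA = [[0.0] * d for _ in range(m)]
    for (i, j), val in entries.items():                  # one pass over non-zeros
        SA[bucket[i]][j] += sign[i] * val
    return SA, bucket, sign

# Toy 8 x 3 sparse matrix with four non-zero entries (illustrative data only).
entries = {(0, 0): 1.0, (2, 1): -3.0, (5, 0): 2.0, (7, 2): 4.0}
SA, bucket, sign = countsketch_apply(entries, n=8, d=3, m=4, seed=1)
```

The matrix $S$ is never formed explicitly: it has exactly one non-zero ($\pm 1$) per column, which is what makes the single pass possible.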

Abstract

We consider relative error low rank approximation of tensors with respect to the Frobenius norm: given an order-$q$ tensor $A \in \mathbb{R}^{\prod_{i=1}^q n_i}$, output a rank-$k$ tensor $B$ for which $\|A-B\|_F^2 \leq (1+\epsilon)\,\mathrm{OPT}$, where $\mathrm{OPT} = \inf_{\textrm{rank-}k~A'} \|A-A'\|_F^2$. Despite the success in obtaining relative error low rank approximations for matrices, no such results were known for tensors. One structural issue is that there may be no rank-$k$ tensor $A_k$ achieving the above infimum. Another issue, a computational one, is that an efficient relative error low rank approximation algorithm for tensors would allow one to compute the rank of a tensor, which is NP-hard. We bypass these issues via (1) bicriteria and (2) parameterized complexity solutions: (1) We give an algorithm which outputs a rank $k' = O((k/\epsilon)^{q-1})$ tensor $B$ for which $\|A-B\|_F^2 \leq (1+\epsilon)\,\mathrm{OPT}$ in $\mathrm{nnz}(A) + n \cdot \textrm{poly}(k/\epsilon)$ time in the real RAM model. Here $\mathrm{nnz}(A)$ is the number of non-zero entries in $A$. (2) We give an algorithm for any $\delta>0$ which outputs a rank $k$ tensor $B$ for which $\|A-B\|_F^2 \leq (1+\epsilon)\,\mathrm{OPT}$ and runs in $(\mathrm{nnz}(A) + n \cdot \textrm{poly}(k/\epsilon) + \exp(k^2/\epsilon)) \cdot n^\delta$ time in the unit cost RAM model. For outputting a rank-$k$ tensor, or even a bicriteria solution with rank-$Ck$ for a certain constant $C > 1$, we show a $2^{\Omega(k^{1-o(1)})}$ time lower bound under the Exponential Time Hypothesis. Our results give the first relative error low rank approximations for tensors for a large number of robust error measures for which nothing was known, as well as column, row, and tube subset selection. We also obtain new results for matrices, such as $\mathrm{nnz}(A)$-time CUR decompositions, improving previous $\mathrm{nnz}(A)\log n$-time algorithms, which may be of independent interest.
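The structural issue (no rank-$k$ tensor attaining the infimum) can be seen numerically on a standard example from the tensor literature, which is not taken from this paper: the rank-3 tensor $x\otimes x\otimes y + x\otimes y\otimes x + y\otimes x\otimes x$ is a limit of rank-2 tensors $t\,(x + y/t)^{\otimes 3} - t\,x^{\otimes 3}$ as $t \to \infty$. The snippet below is a minimal sketch, with hypothetical helper names `cp_tensor`, `frob_sq_diff`, and `rank2_error`, that evaluates $\|A-B\|_F^2$ for these rank-2 candidates.

```python
from itertools import product

def cp_tensor(U, V, W):
    """Sum of rank-1 terms U[r] (x) V[r] (x) W[r], returned as nested lists."""
    n1, n2, n3 = len(U[0]), len(V[0]), len(W[0])
    T = [[[0.0] * n3 for _ in range(n2)] for _ in range(n1)]
    for u, v, w in zip(U, V, W):
        for i, j, l in product(range(n1), range(n2), range(n3)):
            T[i][j][l] += u[i] * v[j] * w[l]
    return T

def frob_sq_diff(A, B):
    """Squared Frobenius norm of A - B for equally shaped nested-list tensors."""
    return sum((a - b) ** 2
               for A2, B2 in zip(A, B)
               for A1, B1 in zip(A2, B2)
               for a, b in zip(A1, B1))

x, y = [1.0, 0.0], [0.0, 1.0]
# Rank-3 target: x(x)x(x)y + x(x)y(x)x + y(x)x(x)x
A = cp_tensor([x, x, y], [x, y, x], [y, x, x])

def rank2_error(t):
    """Error of the rank-2 tensor t*(x + y/t)^{(x)3} - t*x^{(x)3} against A."""
    z = [x[0] + y[0] / t, x[1] + y[1] / t]
    B = cp_tensor([[t * c for c in z], [-t * c for c in x]], [z, x], [z, x])
    return frob_sq_diff(A, B)

# The error shrinks toward 0 as t grows, yet no rank-2 tensor equals A.
errors = [rank2_error(t) for t in (10.0, 100.0, 1000.0)]
```

Expanding the cube shows the error is $3/t^2 + 1/t^4$, so for $k = 2$ the infimum is 0 but is approached only in the limit, with factor entries growing like $t$.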

Paper Structure

This paper contains 133 sections, 162 theorems, 729 equations, 18 figures, 42 algorithms.

Key Result

Theorem 1.1

Given a $3$rd order tensor $A \in \mathbb{R}^{n\times n\times n}$, if $A_k$ exists then there is a randomized algorithm running in $\mathrm{nnz}(A) + n \cdot \mathrm{poly}(k/\epsilon)$ time which outputs a (factorization of a) rank-$O(k^2/\epsilon^2)$ tensor $B$ f

Figures (18)

  • Figure 1: A $3$rd order tensor with size $8\times 8\times 8$.
  • Figure 2: Flattening. We flatten a third order $4 \times 4 \times 4$ tensor along the $1$st dimension to obtain a $4 \times 16$ matrix. The red blocks correspond to a column in the original third order tensor, the blue blocks correspond to a row in the original third order tensor, and the green blocks correspond to a tube in the original third order tensor.
  • Figure 3: A $3$rd order tensor contains $n^2$ columns, $n^2$ rows, and $n^2$ tubes.
  • Figure 4: A third order tensor has three types of faces: the column-row faces, the column-tube faces, and the row-tube faces.
  • Figure 5: Column subset selection, row subset selection and tube subset selection.
  • ...and 13 more figures
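The flattening described in Figure 2 can be sketched in a few lines. The snippet below is illustrative only: the function name `flatten_mode1` is hypothetical, and the column ordering (entry $A_{i,j,l}$ goes to row $i$, column $j \cdot n_3 + l$) is an assumed convention that may differ from the one used in the paper.

```python
def flatten_mode1(A):
    """Flatten an n1 x n2 x n3 nested-list tensor into an n1 x (n2*n3) matrix.

    Assumed convention: entry A[i][j][l] lands in row i, column j*n3 + l.
    """
    return [[value for fiber in slab for value in fiber] for slab in A]

n = 4
# Toy tensor whose entries encode their own indices: A[i][j][l] = 100*i + 10*j + l
A = [[[100 * i + 10 * j + l for l in range(n)] for j in range(n)] for i in range(n)]
M = flatten_mode1(A)  # 4 x 16 matrix

# Column j*n + l of M collects the entries A[0][j][l], ..., A[n-1][j][l],
# i.e. the fiber obtained by fixing the last two indices (here j=1, l=2).
column = [row[1 * n + 2] for row in M]
```

Under this convention, every column of the flattened matrix is one length-$n$ fiber of the original tensor, which is why regression and subset-selection steps on the flattening can be read back as operations on the tensor.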

Theorems & Definitions (369)

  • Theorem 1.1: A Version of Theorem \ref{thm:f_bicriteria_algorithm_bit}, bicriteria
  • Theorem 1.2: Combination of Theorem \ref{thm:f_main_algorithm} and \ref{thm:f_main_algorithm_bit}, rank-$k$
  • Theorem 1.3: Combination of Theorem \ref{thm:f_curt_algorithm_input_sparsity} and \ref{thm:f_curt_algorithm_optimal_samples}, $\|\cdot\|_F$-norm, CURT decomposition
  • Theorem 1.4: Combination of Theorem \ref{thm:l1_bicriteria_algorithm_rank_k2_nearly_input_sparsity_time} ($\|\cdot\|_1$-norm), Theorem \ref{thm:lp_bicriteria_algorithm_rank_k2_nearly_input_sparsity_time} ($\|\cdot\|_p$-norm, $p\in (0,1)$), Theorem \ref{thm:lv_l122_polyklogn_approx_algorithm} ($\|\cdot\|_v$-norm or $\ell_1$-$\ell_2$-$\ell_2$), Theorem \ref{thm:lu_l112_polyklogn_approx_algorithm} ($\|\cdot\|_u$-norm or $\ell_1$-$\ell_1$-$\ell_2$)
  • Theorem 1.5: Informal Version of Theorem \ref{thm:w_r_distinct_2d_cols}, weighted
  • Theorem 1.6: Informal Version of Theorem \ref{thm:approximate_tensor_rank_is_eth_hard}
  • Theorem 1.7: Informal Version of Corollary \ref{cor:two_to_the_one_over_eps_to_the_forth}
  • Theorem 1.8: Informal Version of Theorem \ref{thm:f_matrix_cur_algorithm}, Matrix CUR decomposition
  • Definition A.1: $\otimes$ product for vectors
  • Definition A.2: $\mathrm{vec}()$, convert tensor into a vector
  • ...and 359 more