Fast and Accurate Interpolative Decompositions for General, Sparse, and Structured Tensors

Yifan Zhang, Mark Fornace, Michael Lindsey

TL;DR

This work addresses scalable, provably accurate tensor decompositions via CoreID and SatID, introducing adaptive sequential CoreID and marginalized SatID to achieve dimension-independent error bounds. It leverages deterministic and random sketching to accelerate matrix ID subproblems, with specialized algorithms for CP and sparse tensors and rigorous error analyses. Empirical results on synthetic CP data and real sparse tensors show that the proposed methods achieve near-optimal reconstruction with substantial speedups compared to prior approaches. The methods have practical impact for large-scale tensor data analysis in science and engineering by enabling structure-preserving, interpretable decompositions at scale.

Abstract

In this work, we develop deterministic and random sketching-based algorithms for two types of tensor interpolative decompositions (ID): the core interpolative decomposition (CoreID, also known as the structure-preserving HOSVD) and the satellite interpolative decomposition (SatID, also known as the HOID or CURT). We adopt a new adaptive approach that leads to ID error bounds independent of the size of the tensor. In addition to the adaptive approach, we use tools from random sketching to enable an efficient and provably accurate calculation of these decompositions. We also design algorithms specialized to tensors that are sparse or given as a sum of rank-one tensors, i.e., in the CP format. Besides theoretical analyses, numerical experiments on both synthetic and real-world data demonstrate the power of the proposed algorithms.
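
Both tensor decompositions are built from matrix ID subproblems, so a small illustration of a matrix column ID may help fix ideas. The following is a minimal sketch, not the authors' algorithm: it assumes a greedy largest-residual-norm pivot rule (in the spirit of column-pivoted QR) followed by a least-squares fit, and the name `column_id` is hypothetical.

```python
import numpy as np

def column_id(A, k):
    """Rank-k column ID sketch: A is approximated by A[:, cols] @ X.

    Columns are chosen greedily by largest residual norm; this pivot rule
    is an assumption made for illustration only.
    """
    R = A.astype(float).copy()          # residual after projecting out chosen columns
    cols = []
    for _ in range(k):
        norms = np.sum(R**2, axis=0)    # squared residual norm of every column
        j = int(np.argmax(norms))       # pick the column with the largest residual
        cols.append(j)
        q = R[:, j] / np.sqrt(norms[j])
        R -= np.outer(q, q @ R)         # deflate: remove the direction q from all columns
    # Interpolation matrix: least-squares coefficients expressing A in the chosen columns.
    X, *_ = np.linalg.lstsq(A[:, cols], A, rcond=None)
    return np.array(cols), X

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5)) @ rng.standard_normal((5, 40))   # exact rank 5
cols, X = column_id(A, 5)
rel_err = np.linalg.norm(A - A[:, cols] @ X) / np.linalg.norm(A)
print(cols, rel_err)
```

Because the approximation is expressed in actual columns of `A`, the factor `A[:, cols]` inherits properties of the data such as sparsity, which is the structure-preserving feature the abstract refers to.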

Paper Structure

This paper contains 33 sections, 16 theorems, 140 equations, 10 figures, 1 table, 8 algorithms.

Key Result

Theorem 2.2

Consider running alg:basic_cid with target rank $\boldsymbol{k}$, where $k_i\geqslant r_i$. Denote by $\boldsymbol{\mathcal{T}}^{[\boldsymbol{k}]}$ the reconstructed tensor. Suppose that the matrix ID algorithm is $\Phi$-accurate, and denote for brevity $\Phi_i(\cdot) = \Phi(k_i, r_i, \cdot)$. Suppose withou [...] Here $I$ is the identity map. In particular, if $\Phi_i(x) = C_i x$ for all $i$, then [...] If the matri [...]
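
To make the objects in the theorem concrete, here is a minimal sketch of a non-adaptive CoreID. It rests on assumptions: each mode's indices are selected independently by a greedy row ID of that mode's unfolding, which may differ from alg:basic_cid, and the greedy pivot rule stands in for any $\Phi$-accurate matrix ID. The names `row_id`, `unfold`, `basic_core_id`, and `reconstruct` are hypothetical.

```python
import numpy as np

def row_id(M, k):
    """Greedy rank-k row ID sketch: M is approximated by X @ M[rows, :]."""
    R = M.astype(float).copy()
    rows = []
    for _ in range(k):
        norms = np.sum(R**2, axis=1)      # squared residual norm of every row
        i = int(np.argmax(norms))
        rows.append(i)
        q = R[i, :] / np.sqrt(norms[i])
        R -= np.outer(R @ q, q)           # remove the direction q from all rows
    # Solve M[rows].T @ Xt = M.T in the least-squares sense, so X @ M[rows] approximates M.
    Xt, *_ = np.linalg.lstsq(M[rows, :].T, M.T, rcond=None)
    return np.array(rows), Xt.T

def unfold(T, mode):
    """Mode unfolding: the chosen mode becomes the row index."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def reconstruct(core, sats):
    """Multiply the core by one satellite matrix along each mode (Tucker format)."""
    out = core
    for mode, X in enumerate(sats):
        out = np.moveaxis(np.tensordot(X, out, axes=(1, mode)), 0, mode)
    return out

def basic_core_id(T, ranks):
    """Select k_i indices per mode; the core is the sub-tensor at those indices."""
    idx, sats = [], []
    for mode, k in enumerate(ranks):
        rows, X = row_id(unfold(T, mode), k)
        idx.append(rows)
        sats.append(X)
    core = T[np.ix_(*idx)]                # entries of T itself, not a dense transformed core
    return core, sats, idx

rng = np.random.default_rng(0)
G = rng.standard_normal((3, 4, 2))
U = [rng.standard_normal((n, r)) for n, r in zip((12, 10, 8), (3, 4, 2))]
T = reconstruct(G, U)                     # exact multilinear rank (3, 4, 2)
core, sats, idx = basic_core_id(T, (3, 4, 2))
rel_err = np.linalg.norm(T - reconstruct(core, sats)) / np.linalg.norm(T)
print(core.shape, rel_err)
```

With $k_i = r_i$ on a tensor of exact multilinear rank $\boldsymbol{r}$, the reconstruction is exact up to rounding; the theorem concerns how the per-mode ID errors combine when the tensor is only approximately low-rank.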

Figures (10)

  • Figure 1: Illustration of a 3rd-order CoreID (left) and a SatID (right). Both approximate the tensor on the left-hand side in a low-rank Tucker format. In CoreID, the pink core (which need not be contiguous) is selected from the tensor; the 3 satellite matrices or 'nodes' are unrestricted and optimized to best approximate the tensor. In SatID, the blue, green, and orange satellite nodes are selected from the tensor (constructed by stacking the vectors selected from the tensor with corresponding colors); the pink core is unrestricted and optimized to best approximate the tensor.
  • Figure 2: Tensor diagram for the sketch operation on sparse tensor $\boldsymbol{\mathcal{T}}$. A connecting edge means contraction. See orus2014practical for more introductory details on tensor diagrams. In the diagram, $\boldsymbol{S}_2$ is a KFJLT or tensor sketch operator, $\boldsymbol{S}_3$ is a count sketch operator, and $\boldsymbol{S}_1$ is a general unstructured sketch operator (e.g. FJLT or Gaussian). See main text for details.
  • Figure 3: Illustration of the marginalization trick in tensor diagram notation. (a) The tensor being decomposed in this example is given as a CP tensor of order 4. The factors are the 4 circle nodes. (b) We are performing selection for the green mode, adding to $I$ a new column index $\boldsymbol{b}$, which is a triplet indexing the other 3 modes (starting with the orange mode). The diagram represents the matrix $\boldsymbol{Q}_{I^c}^\top \boldsymbol{A}$. The pink triangle represents the $\boldsymbol{Q}_{I^c}$ matrix. (c) To compute the column norms, we need the diagonal of $\boldsymbol{K} = (\boldsymbol{Q}_{I^c}^\top\boldsymbol{A})^\top (\boldsymbol{Q}_{I^c}^\top\boldsymbol{A})$. $\boldsymbol{K}$ is depicted in this diagram. (d) The column norms (i.e., the diagonal of $\boldsymbol{K}$) are represented by the tensor in this diagram. (e) There are $n^3$ scores, and we want to avoid forming them all explicitly. The marginalization trick samples the 3 indices in $\boldsymbol{b}$ one by one (or 'autoregressively') by computing the conditional marginal distribution for each index. We start with the orange index, whose marginal distribution is given by summing over the outgoing legs from the blue nodes in (d). (f) The contraction of the green and the pink nodes gives the grey node, which is efficiently updated every time $I$ gets augmented. The marginal distribution of the orange index is simply the diagonal of the matrix in (f). (g) This can be computed efficiently by inserting a 'stochastic resolution of the identity' $\boldsymbol{S}^\top \boldsymbol{S}$ (where $\boldsymbol{S}$ is a KFJLT sketch matrix, corresponding to the darker blue nodes) along the vertical midline of the diagram, to avoid the $\mathcal{O}(p^2)$ cost where $p$ is the number of rank-1 tensors in the CP tensor. Once the orange index in $\boldsymbol{b}$ is sampled, we repeat this process on the 2 other modes (where in each stage the order of the tensor is reduced by 1).
  • Figure 4: Left: Reconstruction error of the CP CoreID algorithm with sketching, including the average error (solid lines) and min-max errors (shaded region) for 4 matrix ID algorithms. On the bottom, the Tucker error is plotted as the optimal low-rank approximation error. Middle: Comparison of average reconstruction error for sketched and exact algorithms. Right: Average computation times for sketched and exact algorithms.
  • Figure 5: Left: Reconstruction accuracy of the CP SatID algorithm using marginalization and norm sampling with and without sketching, including the average error (solid lines) and min-max errors (shaded region). On the bottom, the Tucker error is plotted as the optimal low-rank approximation error. Right: Average computation times for sketched and exact algorithms, as well as the Tucker decomposition.
  • ...and 5 more figures
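
The autoregressive sampling idea in the Figure 3 caption can be illustrated on a simplified problem. The sketch below is an assumption-laden stand-in rather than the paper's procedure: it samples a full multi-index of a CP tensor with probability proportional to the squared entry (plain norm sampling, with no projection $\boldsymbol{Q}_{I^c}$), and it forms the $p \times p$ Gram matrices exactly instead of using a KFJLT sketch, so it incurs the $\mathcal{O}(p^2)$ cost that the caption says the sketch avoids. The names `conditional_marginal` and `sample_index_cp` are hypothetical.

```python
import numpy as np

def conditional_marginal(factors, m, w):
    """Distribution of index m given the indices already drawn for modes < m.

    factors[n] has shape (n_n, p); w[p] is the product over the sampled modes
    of the selected factor rows (all ones before anything is sampled).
    """
    G = np.outer(w, w)
    for F in factors[m + 1:]:
        G = G * (F.T @ F)                 # summing out later modes gives Gram matrices
    # Unnormalized marginal: diagonal of F_m G F_m^T.
    probs = np.einsum("ip,pq,iq->i", factors[m], G, factors[m])
    probs = np.clip(probs, 0.0, None)     # guard against tiny negative rounding errors
    return probs / probs.sum()

def sample_index_cp(factors, rng):
    """Draw (i_1, ..., i_d) with probability proportional to T[i_1, ..., i_d]**2,
    where T is the CP tensor with the given factors, without forming T."""
    w = np.ones(factors[0].shape[1])
    index = []
    for m, F in enumerate(factors):
        probs = conditional_marginal(factors, m, w)
        i = int(rng.choice(len(probs), p=probs))
        index.append(i)
        w = w * F[i, :]                   # condition on the sampled index
    return tuple(index)

rng = np.random.default_rng(1)
factors = [rng.standard_normal((n, 3)) for n in (4, 5, 6)]
print(sample_index_cp(factors, rng))
```

Each conditional marginal costs $\mathcal{O}(n p^2)$ here, whereas forming all scores explicitly would require a number of evaluations that grows exponentially in the number of modes.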

Theorems & Definitions (38)

  • Definition 2.1
  • Theorem 2.2
  • proof
  • Remark 2.3
  • Example 2.4
  • Example 2.5
  • Example 2.6
  • Definition 2.7: $$-SE
  • Theorem 2.8
  • proof
  • ...and 28 more