Improved Analysis of Khatri-Rao Random Projections and Applications

Arvind K. Saibaba, Bhisham Dev Verma, Grey Ballard

Abstract

Randomization has emerged as a powerful set of tools for large-scale matrix and tensor decompositions. Randomized algorithms involve computing sketches with random matrices. A prevalent approach is to take the random matrix to be a standard Gaussian random matrix, for which the theory is well developed. However, this approach has the drawback that generating and multiplying by the random matrix can be prohibitively expensive. Khatri-Rao random projections (KRPs), obtained by sketching with Khatri-Rao products of random matrices, offer a viable alternative and are much cheaper to generate. However, the theoretical guarantees for KRPs are much more pessimistic than the accuracy observed in practice. We attempt to close this gap with an improved analysis of the use of KRPs in matrix and tensor low-rank decompositions. We propose and analyze a new algorithm for low-rank approximations of block-structured matrices (e.g., block Hankel) using KRPs. We also show how to accelerate tensor computations in the Tucker format using KRPs and give theoretical guarantees for the resulting low-rank approximations. Numerical experiments on synthetic and real-world tensors show the computational benefits of the proposed methods.
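To make the idea of sketching with a Khatri-Rao product concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes two Gaussian factor matrices and a matrix whose column dimension factors as `n1 * n2`; the function names `khatri_rao` and `krp_sketch` are hypothetical and chosen here for illustration. The sketch is computed without ever forming the full `(n1 * n2) x ell` random matrix, so only `(n1 + n2) * ell` random numbers are generated instead of `n1 * n2 * ell`.

```python
import numpy as np


def khatri_rao(factors):
    """Column-wise Kronecker product of matrices that share a column count."""
    out = factors[0]
    for f in factors[1:]:
        # out is (p, ell), f is (q, ell); the result is (p * q, ell),
        # with column j equal to np.kron(out[:, j], f[:, j]).
        out = np.einsum("ij,kj->ikj", out, f).reshape(-1, out.shape[1])
    return out


def krp_sketch(A, dims, ell, rng):
    """Return Y = A @ (Omega_1 kr Omega_2) and the factors, for Gaussian factors.

    Assumption (illustrative only): A has shape (m, n1 * n2) and exactly two
    factors are used. The Khatri-Rao product is never formed explicitly.
    """
    n1, n2 = dims
    factors = [rng.standard_normal((n, ell)) for n in (n1, n2)]
    # Row-major reshape: A3[i, a, b] == A[i, a * n2 + b], matching np.kron.
    A3 = A.reshape(A.shape[0], n1, n2)
    Y = np.einsum("mab,aj,bj->mj", A3, factors[0], factors[1])
    return Y, factors
```

For small problems, the result can be checked against the explicit product `A @ khatri_rao(factors)`; the two agree up to floating-point rounding.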

Paper Structure

This paper contains 36 sections, 10 theorems, 70 equations, 4 figures, 1 table, 8 algorithms.

Key Result

Lemma 1

Let $Z_1,\dots, Z_\ell$ be a sequence of independent and identically distributed mean-zero random variables with $\|Z_i\|_{L^p} \le (Cp)^d$ for $1\le i \le \ell$, then for $p \ge 2$ where $C'$ is a constant independent of $p,\ell$ but that depends on $C$.

Figures (4)

  • Figure 1: Left: Average running time over 10 runs for various methods used to recover the system matrices. Right: Average recovery error, measured by the Hausdorff distance between the eigenvalue set of the ground-truth matrix ${\boldsymbol{A}}$ and the eigenvalue set of its reconstruction using R-KRP, R-Dense, and RandERA methods.
  • Figure 1: Plots of relative error versus target ranks for a 4-way Cauchy tensor with $n=250$ and exponents $\alpha = 2$.
  • Figure 1: Left: Average speedup over 10 runs for different rank values. Right: Average relative error over 10 runs for different rank values.
  • Figure 2: Running time in seconds of the different methods (left panel) on the Cauchy-like tensor averaged over three runs and a breakdown of the running times in different components (right panel). RNG refers to the time of generating pseudorandom numbers, 'Unfolding' refers to the time required for matricization, 'QR' refers to the time for orthogonalizing the sketches, 'mult/KRP' refers to the time for matrix multiplication in the case of Gaussians and MTTKRP for KRP matrices, and 'Core' refers to the time for forming the core tensor.
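The recovery error in the first figure caption is a Hausdorff distance between two eigenvalue sets. As a hedged illustration, the sketch below computes the standard Hausdorff distance between two finite sets of complex numbers. The paper's exact definition is not reproduced in this summary, so the textbook form is an assumption, and the function names `hausdorff_distance` and `eigenvalue_recovery_error` are hypothetical.

```python
import numpy as np


def hausdorff_distance(x, y):
    """Standard Hausdorff distance between two finite sets of complex numbers."""
    x = np.asarray(x, dtype=complex).ravel()
    y = np.asarray(y, dtype=complex).ravel()
    # d[i, j] = |x_i - y_j|
    d = np.abs(x[:, None] - y[None, :])
    # Largest distance from a point in one set to its nearest point in the other.
    return max(d.min(axis=1).max(), d.min(axis=0).max())


def eigenvalue_recovery_error(A_true, A_rec):
    """Hausdorff distance between the spectra of two square matrices."""
    return hausdorff_distance(np.linalg.eigvals(A_true), np.linalg.eigvals(A_rec))
```

Because the distance compares sets, it does not depend on the order in which the eigenvalues are returned.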

Theorems & Definitions (20)

  • Lemma 1
  • Proof 1
  • Definition 2: KRP
  • Lemma 3: Isotropic columns
  • Proof 2
  • Lemma 4
  • Theorem 1
  • Lemma 2
  • Proof 3
  • Theorem 3
  • ...and 10 more