Fast Exact Leverage Score Sampling from Khatri-Rao Products with Applications to Tensor Decomposition

Vivek Bharadwaj; Osman Asif Malik; Riley Murray; Laura Grigori; Aydin Buluc; James Demmel

TL;DR

A data structure to randomly sample rows from the Khatri-Rao product of several matrices according to the exact distribution of its leverage scores, which achieves lower asymptotic complexity per solve than recent state-of-the-art methods.

Abstract

We present a data structure to randomly sample rows from the Khatri-Rao product of several matrices according to the exact distribution of its leverage scores. Our proposed sampler draws each row in time logarithmic in the height of the Khatri-Rao product and quadratic in its column count, with persistent space overhead at most the size of the input matrices. As a result, it tractably draws samples even when the matrices forming the Khatri-Rao product have tens of millions of rows each. When used to sketch the linear least squares problems arising in CANDECOMP / PARAFAC tensor decomposition, our method achieves lower asymptotic complexity per solve than recent state-of-the-art methods. Experiments on billion-scale sparse tensors validate our claims, with our algorithm achieving higher accuracy than competing methods as the decomposition rank grows.
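To make the sampling target concrete, here is a minimal brute-force sketch in Python with NumPy. It is not the data structure proposed in the paper: it materializes the full Khatri-Rao product, so its cost grows with the product of the row counts rather than logarithmically in it. It relies only on the standard identity that the Gram matrix of a Khatri-Rao product equals the Hadamard product of the factors' Gram matrices. The function names (`khatri_rao`, `leverage_scores_kr`, `sample_rows`) are illustrative assumptions, not names from the paper.

```python
import numpy as np

def khatri_rao(mats):
    """Column-wise Kronecker (Khatri-Rao) product of matrices sharing R columns."""
    out = mats[0]
    for U in mats[1:]:
        # Row (i, k) of the result is the elementwise product of out[i] and U[k];
        # the index of the earlier factor varies slowest (C order).
        out = (out[:, None, :] * U[None, :, :]).reshape(-1, out.shape[1])
    return out

def leverage_scores_kr(mats):
    """Exact leverage scores of the Khatri-Rao product (brute-force baseline)."""
    R = mats[0].shape[1]
    G = np.ones((R, R))
    for U in mats:
        G *= U.T @ U  # Gram of the product = Hadamard product of the factor Grams
    A = khatri_rao(mats)  # materialized only because this is a naive baseline
    # Score of row a is a @ pinv(G) @ a.T, computed for all rows at once.
    scores = np.einsum("ir,rs,is->i", A, np.linalg.pinv(G), A)
    return np.clip(scores, 0.0, None)  # guard against tiny negative round-off

def sample_rows(mats, J, rng):
    """Draw J row multi-indices (i_1, ..., i_N) with probability proportional to leverage."""
    scores = leverage_scores_kr(mats)
    probs = scores / scores.sum()
    flat = rng.choice(len(probs), size=J, p=probs)
    dims = [U.shape[0] for U in mats]
    return np.stack(np.unravel_index(flat, dims), axis=1), probs
```

For example, `sample_rows([U1, U2, U3], J=100, rng=np.random.default_rng(0))` returns a `(100, 3)` array of row indices into the three factors together with the full probability vector. Holding that vector in memory is exactly what becomes infeasible when each factor has millions of rows, which is the regime the abstract targets.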

Paper Structure (42 sections, 6 theorems, 40 equations, 11 figures, 6 tables, 6 algorithms)

Introduction
Preliminaries and Related Work
Notation.
Sketched Linear Least Squares
Leverage Score Sampling.
Prior Work
Khatri-Rao Product Leverage Score Sampling.
Comparison to Woodruff and Zandieh.
Kronecker Regression.
An Efficient Khatri-Rao Leverage Sampler
Efficient Sampling from $q_{h, U, Y}$
Sampling from the Khatri-Rao Product
Application to Tensor Decomposition
Experiments
Runtime Benchmark
...and 27 more sections

Key Result

Theorem 1.1

Given $U_1, ..., U_N$ with $U_j \in \mathbb{R}^{I_j \times R}$, there exists a data structure satisfying the following:

Figures (11)

Figure 1: A segment tree $T_{8,2}$ and probability distribution $\{q_1, ..., q_8\}$ on $\left[ 1, ..., 8 \right]$.
Figure 2: Average time (5 trials) to construct our proposed sampler and draw $J=50,000$ samples from $U_1 \odot ... \odot U_N$, with $U_j \in \mathbb{R}^{I \times R}\ \forall j$. Error bars indicate 3 standard deviations.
Figure 3: Distortion and residual error (50 trials) for varying $R$ and $N$ on least squares, $I=2^{16}, J=5000$. "X" marks indicate outliers 1.5 times the interquartile range beyond the median, stars indicate means.
Figure 4: Average fits (8 trials) achieved by randomized ($J=2^{16}$) and exact ALS for sparse tensor CP decomposition. Error bars indicate 3 standard deviations. See Appendix \ref{appendix:sparse_cp} for details.
Figure 5: Average $\varepsilon$ (5 runs) for randomized least squares solves in 10 ALS rounds, $R=50$.
...and 6 more figures

Theorems & Definitions (12)

Theorem 1.1: Efficient Khatri-Rao Product Leverage Sampling
Theorem 2.1: Guarantees for Leverage Score Sampling
Theorem 3.1: Malik 2022, Adapted
Lemma 3.2: Efficient Row Sampler
Corollary 3.3: STS-CP
proof: Proof of Theorem \ref{thm:malik2022}
proof
proof
proof
Corollary A.1: Sparse Input Modification
...and 2 more
