Efficient Leverage Score Sampling for Tensor Train Decomposition
Vivek Bharadwaj, Beheshteh T. Rakhshan, Osman Asif Malik, Guillaume Rabusseau
TL;DR
The paper tackles the computational bottleneck of TT-ALS for high-order tensors by introducing rTT-ALS, a sampling-based TT-ALS framework that uses exact leverage score sampling supported by a dedicated data structure. Maintaining the TT in canonical form makes the left and right chains of TT cores orthonormal, so that Φ = I and rows can be sampled efficiently from the squared-row-norm distribution. The data structure has construction time $O\left(\sum_{n=1}^j I_n R_{n-1} R_n^2\right)$ and per-sample time $O\left(\sum_{k=1}^j \log\left(I_k R_{k-1}/R_k\right) R_k^2\right)$, which simplifies to $O(j R^2 \log I)$ when all dimensions equal $I$ and all ranks equal $R$. The authors give a proof outline for the sampling procedure and show empirically that rTT-ALS achieves speedups of up to about 16× over non-randomized TT-ALS, with competitive accuracy, on both dense and sparse tensors, including massive real-world datasets. This approach enables scalable TT decompositions for large-scale ML and physics applications and suggests extensions to other tensor-network architectures that exploit canonical forms for efficient sampling.
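The link between canonical form and the squared-row-norm distribution can be checked numerically on a small example. The following is a minimal NumPy sketch, not the authors' implementation: the helper names (`left_orthonormalize`, `left_chain_matrix`, `leverage_scores`) and the tiny dimensions are assumptions made for illustration, and the chain matrix is formed explicitly, which the paper's data structure is designed to avoid.

```python
import numpy as np


def left_orthonormalize(cores):
    """Left-to-right QR sweep; each core has shape (R_prev, I, R_next)."""
    cores = [c.copy() for c in cores]
    for n in range(len(cores) - 1):
        r_prev, i_n, r_next = cores[n].shape
        q, r = np.linalg.qr(cores[n].reshape(r_prev * i_n, r_next))
        cores[n] = q.reshape(r_prev, i_n, q.shape[1])
        # Push the triangular factor into the next core so the tensor is unchanged.
        cores[n + 1] = np.einsum("ab,bic->aic", r, cores[n + 1])
    return cores


def left_chain_matrix(cores, j):
    """Contract the first j cores into a matrix of shape (I_1 * ... * I_j, R_j)."""
    mat = cores[0].reshape(cores[0].shape[1], cores[0].shape[2])  # R_0 = 1
    for n in range(1, j):
        r_prev, i_n, r_next = cores[n].shape
        mat = (mat @ cores[n].reshape(r_prev, i_n * r_next)).reshape(-1, r_next)
    return mat


def leverage_scores(a):
    """Generic leverage scores: l_i = a_i^T (A^T A)^+ a_i."""
    return np.einsum("ij,jk,ik->i", a, np.linalg.pinv(a.T @ a), a)


# Hypothetical small TT with random cores (illustration only).
rng = np.random.default_rng(0)
dims, ranks = [4, 5, 6, 3], [1, 3, 4, 2, 1]
cores = [rng.standard_normal((ranks[n], dims[n], ranks[n + 1])) for n in range(4)]

a = left_chain_matrix(left_orthonormalize(cores), j=3)
row_norms_sq = np.sum(a**2, axis=1)

print(np.allclose(a.T @ a, np.eye(a.shape[1])))       # checks the Gram matrix against I
print(np.allclose(leverage_scores(a), row_norms_sq))  # checks scores against row norms

# Draw row indices from the leverage score distribution (scores sum to R_j).
samples = rng.choice(a.shape[0], size=10, p=row_norms_sq / row_norms_sq.sum())
```

Forming the chain matrix costs time proportional to $I_1 \cdots I_j$, so this check is only feasible for toy sizes; it illustrates why orthonormality removes the need for a Gram-matrix inverse when computing the scores.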
Abstract
Tensor Train (TT) decomposition is widely used in the machine learning and quantum physics communities to efficiently compress high-dimensional tensor data. In this paper, we propose an efficient algorithm, based on exact leverage score sampling, that accelerates computing the TT decomposition with the Alternating Least Squares (ALS) algorithm. For this purpose, we propose a data structure that allows us to efficiently sample from the tensor with time complexity logarithmic in the tensor size. Our contribution specifically leverages the canonical form of the TT decomposition. By maintaining the canonical form through each iteration of ALS, we can efficiently compute (and sample from) the leverage scores, thus achieving significant speed-up in solving each sketched least-squares problem. Experiments on synthetic and real data on dense and sparse tensors demonstrate that our method outperforms SVD-based and ALS-based algorithms.
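To make the notion of a sketched least-squares problem concrete, here is a minimal NumPy sketch of leverage score row sampling for a generic tall matrix. It is an illustration under stated assumptions rather than the paper's algorithm: the function names, the Gaussian test matrix, and the sample count are all hypothetical, the matrix is assumed to have full column rank, and the scores are computed by a dense QR instead of the TT-specific data structure.

```python
import numpy as np


def leverage_scores(a):
    """Exact leverage scores via a thin QR: squared row norms of Q.

    Assumes a has full column rank.
    """
    q, _ = np.linalg.qr(a)
    return np.sum(q**2, axis=1)


def sketched_lstsq(a, b, num_samples, rng):
    """Solve min ||A x - b|| using only rows sampled by leverage score."""
    probs = leverage_scores(a)
    probs = probs / probs.sum()
    idx = rng.choice(a.shape[0], size=num_samples, p=probs)
    # Rescale sampled rows so the sketched Gram matrix is an unbiased estimate.
    scale = 1.0 / np.sqrt(num_samples * probs[idx])
    x, *_ = np.linalg.lstsq(a[idx] * scale[:, None], b[idx] * scale, rcond=None)
    return x


# Hypothetical test problem (illustration only).
rng = np.random.default_rng(0)
a = rng.standard_normal((2000, 5))
b = a @ rng.standard_normal(5) + 0.1 * rng.standard_normal(2000)

x_exact, *_ = np.linalg.lstsq(a, b, rcond=None)
x_sketch = sketched_lstsq(a, b, num_samples=500, rng=rng)

ratio = np.linalg.norm(a @ x_sketch - b) / np.linalg.norm(a @ x_exact - b)
print(f"residual ratio (sketched / exact): {ratio:.3f}")
```

The sketched solve touches only `num_samples` rows of the design matrix and right-hand side, which is the source of the speed-up in a sampling-based ALS step; the residual ratio printed at the end compares the sketched solution against the exact one.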
