A Practical Mode-parallel Implementation of the (H-)Tucker Decomposition via Randomization

Martina Iannacito; Sascha Portaro; Davide Palitta; Claudio Arlandini; Domitilla Brandoni

Abstract

In the last decades, tensors have emerged as the right tool to represent multidimensional data in a compact yet informative manner. Moreover, it is well known that low-rank factorizations of such tensors can often unveil hidden structure in the data, mainly due to unexpected dependencies among the different variables encoded in the given tensor. However, computing these factorizations is extremely energy-consuming and memory-demanding, especially for high-dimensional tensors, namely those with a large number of modes. In this paper we focus on two state-of-the-art tensor decompositions: the Tucker and H-Tucker decompositions. We propose novel numerical strategies able to perform these factorizations in a \emph{mode-parallel} fashion, that is, the operations required by the algorithm along all modes are performed in parallel. This is in contrast to many procedures available in the literature, which parallelize some of the operations along each mode, e.g., tensor-times-matrix steps, while still visiting one mode at a time in a sequential manner. Our strategies make use of cutting-edge randomization techniques comprising fiber sampling and randomized range-finding steps. We provide upper bounds on the expected value of the error incurred by our factorizations, while a panel of numerical results showcases the potential of our approach in reducing both the running time and the storage demand of the whole procedure. Moreover, experiments carried out in HPC environments illustrate the good scaling of our mode-parallel approach.
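To make the mode-parallel idea concrete, the following is a minimal NumPy sketch, not the paper's Sub-R-HOSVD algorithm. It assumes uniform fiber sampling without replacement and a Gaussian test matrix for the range-finding step. The function names (`unfold`, `mode_basis`, `sketched_tucker`, `reconstruct`), the `n_fibers` parameter, and the thread pool are all hypothetical choices made for illustration. The key point is that each mode's basis is computed from the original tensor alone, so the per-mode tasks are independent and can be dispatched concurrently.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def unfold(X, mode):
    """Mode-`mode` unfolding: the mode-`mode` fibers become the columns."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)


def mode_basis(X, mode, rank, n_fibers, seed):
    """Orthonormal basis for one mode from sampled fibers (illustrative)."""
    rng = np.random.default_rng(seed)
    Xn = unfold(X, mode)
    # Assumption: fibers are sampled uniformly at random, without replacement.
    n_cols = min(n_fibers, Xn.shape[1])
    cols = rng.choice(Xn.shape[1], size=n_cols, replace=False)
    S = Xn[:, cols]
    # Randomized range-finder on the sampled fibers (Gaussian test matrix).
    Omega = rng.standard_normal((S.shape[1], rank))
    Q, _ = np.linalg.qr(S @ Omega)
    return Q


def sketched_tucker(X, ranks, n_fibers, seed=0):
    """Compute all factor matrices independently, then form the core."""
    seeds = np.random.SeedSequence(seed).spawn(X.ndim)
    # Every task reads only X, so the modes can be processed concurrently.
    with ThreadPoolExecutor() as pool:
        factors = list(pool.map(
            lambda m: mode_basis(X, m, ranks[m], n_fibers, seeds[m]),
            range(X.ndim),
        ))
    core = X
    for m, U in enumerate(factors):
        # Tensor-times-matrix product with U^T along mode m.
        core = np.moveaxis(np.tensordot(U.T, core, axes=(1, m)), 0, m)
    return core, factors


def reconstruct(core, factors):
    """Multiply the core by each factor matrix along its mode."""
    X = core
    for m, U in enumerate(factors):
        X = np.moveaxis(np.tensordot(U, X, axes=(1, m)), 0, m)
    return X
```

On a tensor with exactly low multilinear rank, calling `sketched_tucker(X, ranks, n_fibers)` with `n_fibers` at least as large as each target rank should return factors whose reconstruction matches `X` up to rounding error. For general tensors the accuracy depends on the sampling and on the decay of the mode-wise singular values.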

Paper Structure (19 sections, 5 theorems, 44 equations, 7 figures, 4 algorithms)

Introduction
Preliminaries
Notation
Tensor basics
Randomized range-finder
The Tucker decomposition
Subsampled randomized HOSVD
Error bounds
Numerical results
Synthetic tests
Real-world examples
Parallel experiment
The H-Tucker decomposition
Subsampled randomized RtL-HT
Numerical experiments
...and 4 more sections

Key Result

Lemma 3.1

Let ${\mathbf{X}}={\mathbf{U\Sigma V}}^T$ with ${\mathbf{V}}=[{\mathbf{V}}_1,{\mathbf{V}}_2]$, ${\mathbf{V}}_1\in \mathbb{R}^{m \times r}$, ${\mathbf{V}}_2\in\mathbb{R}^{m \times (n-r)}$ as in partition. Define the quantities $M_i := m\cdot\mu({\mathbf{V}}_i)$, $i=1,2$. Se… with failure probability at most $r \cdot \left( \frac{\mathtt e^{-\delta}}{(1-\delta)^{1-\delta}} \right. \cdots$

Figures (7)

Figure 1: Parallel performance of Sub-R-HOSVD. The dashed line with slope 1 denotes the ideal linear speed-up.
Figure 1: Dimension tree for an order-$8$ tensor with transfer tensors enumerated by heap indexing.
Figure 2: Parallel performance of Sub-R-HOSVD with a parallel computation of the indices for the fiber sampling. The dashed line with slope 1 denotes the ideal linear speed-up.
Figure 2: Comparison of the considered methods on synthetic tensors with i.i.d. entries drawn from a standard normal distribution. All the randomized algorithms are tested over $25$ independent runs.
Figure 3: Parallel performance of Sub-R-LtR-HT using $1$, $2$, $4$ and $8$ processes. The dashed line with slope 1 denotes the ideal linear speed-up.
...and 2 more figures

Theorems & Definitions (13)

Definition 2.1
Definition 2.2
Lemma 3.1
Proof 1
Lemma 3.2
Proof 2
Theorem 3.3
Proof 3
Theorem 3.4
Proof 4
...and 3 more
