A Random Matrix Approach to Low-Multilinear-Rank Tensor Approximation

Hugo Lebeau; Florent Chatelain; Romain Couillet

TL;DR

This work analyzes the problem of recovering a planted low-multilinear-rank tensor from a noisy observation in a general spiked tensor model, focusing on the regime near the computational threshold. It leverages classical random matrix theory to study the spectral behavior of tensor unfoldings, proving that, after centering and scaling, their eigenvalue distributions converge to a semicircle with a BBP-like isolated eigenvalue appearing when the mode-specific signal-to-noise ratios exceed one. These spectral insights are used to quantify the reconstruction performance of truncated MLSVD, with precise asymptotics for subspace alignments and phase transitions for each unfolded mode. Moreover, the paper shows that initializing the Higher-Order Orthogonal Iteration with MLSVD results in convergence to the maximum-likelihood solution in a single iteration in the large-$N$ limit, thereby clarifying the practical computational-to-statistical limits and the pivotal role of initialization in tensor recovery.

Abstract

This work presents a comprehensive understanding of the estimation of a planted low-rank signal from a general spiked tensor model near the computational threshold. Relying on standard tools from the theory of large random matrices, we characterize the large-dimensional spectral behavior of the unfoldings of the data tensor and exhibit relevant signal-to-noise ratios governing the detectability of the principal directions of the signal. These results allow us to accurately predict the reconstruction performance of truncated multilinear SVD (MLSVD) in the non-trivial regime. This is particularly important since truncated MLSVD serves as the initialization of the higher-order orthogonal iteration (HOOI) scheme, whose convergence to the best low-multilinear-rank approximation depends entirely on its initialization. We give a sufficient condition for the convergence of HOOI and show that the number of iterations before convergence tends to $1$ in the large-dimensional limit.
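
The spectral behavior described above can be illustrated numerically. The following is a minimal NumPy sketch, not the authors' code. It assumes Gaussian noise entries of variance $1/N$, a centering $\mu^{(\ell)}_N = \frac{1}{N} \prod_{\ell' \neq \ell} n_{\ell'}$ and a scale $\sigma_N = \frac{1}{N} \sqrt{\prod_{\ell} n_\ell}$; these two formulas are assumptions obtained from a direct variance computation and chosen to match the normalization used in the figure captions. The helper names `unfold` and `scaled_spectrum` and the rank-one signal with ratio `rho = 5` are hypothetical choices for illustration.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding: rows indexed by that mode, columns by all others."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def scaled_spectrum(tensor, mode, N):
    """Eigenvalues of T T^T for one unfolding, centered by mu and divided by sigma.

    Assumed normalization (not taken from the text): mu = prod_{l' != mode} n_l' / N
    and sigma = sqrt(prod_l n_l) / N, for noise entries of variance 1/N.
    """
    dims = np.array(tensor.shape, dtype=float)
    mu = np.prod(dims) / (dims[mode] * N)
    sigma = np.sqrt(np.prod(dims)) / N
    T = unfold(tensor, mode)
    return (np.linalg.eigvalsh(T @ T.T) - mu) / sigma  # ascending order

rng = np.random.default_rng(0)
dims = (30, 40, 50)
N = sum(dims)
sigma = np.sqrt(np.prod(dims)) / N

# Pure noise tensor with entries of variance 1/N.
noise = rng.standard_normal(dims) / np.sqrt(N)

# Hypothetical rank-one signal with squared Frobenius norm rho * sigma.
x, y, z = [v / np.linalg.norm(v) for v in (rng.standard_normal(n) for n in dims)]
rho = 5.0
signal = np.sqrt(rho * sigma) * np.einsum("i,j,k->ijk", x, y, z)

bulk = scaled_spectrum(noise, 0, N)
spiked = scaled_spectrum(noise + signal, 0, N)

print("noise only, scaled spectrum range:", bulk.min(), bulk.max())
print("with signal, two largest scaled eigenvalues:", spiked[-2], spiked[-1])
```

At these small dimensions the noise-only spectrum should sit close to $[-2, 2]$ with a visible finite-size shift, while the strong rank-one signal should push one eigenvalue well outside the bulk.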

Paper Structure (20 sections, 8 theorems, 29 equations, 4 figures, 1 algorithm)

Introduction
Low-Rank Tensor Estimation
Related Work
Summary of Contributions
Preliminaries on Tensors and Random Matrix Theory
General Notations
Tensors, Related Operations and Decompositions
Canonical Polyadic Decomposition (CPD).
Multilinear Singular Value Decomposition (MLSVD).
Tools from Random Matrix Theory
Analysis of Truncated MLSVD under the General Spiked Tensor Model
Random Matrix Results on the Model
Reconstruction Performance of Truncated MLSVD
Numerical Estimation of the Best Low-Multilinear-Rank Approximation
Higher-Order Orthogonal Iteration
...and 5 more sections

Key Result

Lemma 4

Let $Z \sim {\mathcal{N}}(0, 1)$ and $f : {\mathbb{R}} \to {\mathbb{C}}$ be a continuously differentiable function. When the following expectations exist, $\mathbb{E}[Z f(Z)] = \mathbb{E}[f'(Z)]$.
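
As a quick sanity check of this identity, the sketch below compares Monte Carlo estimates of both sides for three test functions, one of them complex-valued. The test functions, the sample size, and the helper name `stein_gap` are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(1_000_000)  # samples of Z ~ N(0, 1)

def stein_gap(f, f_prime, samples):
    """Absolute difference between Monte Carlo estimates of E[Z f(Z)] and E[f'(Z)]."""
    lhs = np.mean(samples * f(samples))
    rhs = np.mean(f_prime(samples))
    return abs(lhs - rhs)

gaps = {
    # f(z) = tanh(z), f'(z) = 1 - tanh(z)^2
    "tanh": stein_gap(np.tanh, lambda t: 1.0 - np.tanh(t) ** 2, z),
    # f(z) = z^3, f'(z) = 3 z^2 (both expectations equal 3)
    "cubic": stein_gap(lambda t: t ** 3, lambda t: 3.0 * t ** 2, z),
    # complex-valued f(z) = exp(i z), f'(z) = i exp(i z)
    "complex_exp": stein_gap(lambda t: np.exp(1j * t), lambda t: 1j * np.exp(1j * t), z),
}

for name, gap in gaps.items():
    print(f"{name}: |E[Z f(Z)] - E[f'(Z)]| is approximately {gap:.4f}")
```

The gaps are not exactly zero because both sides are estimated from a finite sample; they should shrink at the usual Monte Carlo rate as the sample size grows.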

Figures (4)

Figure 1: Illustration of the Tucker decomposition (\ref{eq:P_decompositon}) of an $n_1 \times n_2 \times n_3$ tensor ${\bm{\mathscr{P}}}$ with multilinear rank $(r_1, r_2, r_3)$. ${\bm{\mathscr{H}}}$ is the $r_1 \times r_2 \times r_3$ core tensor and ${\bm{X}}^{(1)}, {\bm{X}}^{(2)}, {\bm{X}}^{(3)}$ are matrices with orthonormal columns spanning the singular subspaces of ${\bm{\mathscr{P}}}$.
Figure 2: Alignments between singular subspaces (see the section "Reconstruction Performance of Truncated MLSVD") of the observation ${\bm{\mathscr{T}}} = \sqrt{\omega} {\bm{\mathscr{P}}}_\circ + \frac{1}{\sqrt{N}} {\bm{\mathscr{N}}}$ and of the signal ${\bm{\mathscr{P}}}_\circ$, with $\lVert {\bm{\mathscr{P}}}_\circ \rVert_{\mathrm{F}}^2 = \frac{\sqrt{n_1 n_2 n_3}}{N}$, as a function of the signal-to-noise ratio $\omega$. Theoretical alignments (Theorem 8) achieved with truncated MLSVD are compared with simulations and with those achieved by the HOOI algorithm. Empirical results are averaged over $10$ trials, with error bars representing the standard deviation. Experimental setting: $d = 3$, $(n_1, n_2, n_3) = (100, 200, 300)$, $N = n_1 + n_2 + n_3$ and $(r_1, r_2, r_3) = (3, 4, 5)$.
Figure 3: Top: empirical spectral distribution (ESD) of ${\bm{T}}^{(\ell)} {\bm{T}}^{(\ell) \top}$. The orange curve is the density of the stretched semicircle on $[\mu^{(\ell)}_N \pm 2 \sigma_N]$ (Corollary 5). Green dashed lines represent the asymptotic positions of the spikes $\mu^{(\ell)}_N + \sigma_N \tilde{\xi}^{(\ell)}_{q_\ell}$ (Theorem 8). Bottom: observed alignments between the dominant eigenvectors of ${\bm{T}}^{(\ell)} {\bm{T}}^{(\ell) \top}$ and ${\bm{P}}^{(\ell)} {\bm{P}}^{(\ell) \top}$ (purple bars) with their predicted asymptotic values $[\zeta^{(\ell)}_{q_\ell}]^+$ (red curve, Theorem 8). Experimental setting: $d = 3$, $(n_1, n_2, n_3) = (300, 500, 700)$, $N = n_1 + n_2 + n_3$, $(r_1, r_2, r_3) = (3, 4, 5)$ and $\lVert {\bm{\mathscr{P}}} \rVert_{\mathrm{F}}^2 / \sigma_N = 15$.
Figure 4: Alignments between singular subspaces of the observation ${\bm{\mathscr{T}}} = {\bm{\mathscr{P}}} + \frac{1}{\sqrt{N}} {\bm{\mathscr{N}}}$ and of the signal ${\bm{\mathscr{P}}}$, with $\lVert {\bm{\mathscr{P}}} \rVert_{\mathrm{F}}^2 / \sigma_N = 10$, at initialization of Algorithm 1 (i.e., truncated MLSVD) and after the first iteration, as a function of the size of the tensor given by the parameter $N$. Left: $\frac{1}{r_\ell} \lVert {\bm{X}}^{(\ell) \top} {\bm{U}}^{(\ell)}_0 \rVert_{\mathrm{F}}^2$. Middle: $\frac{1}{r_\ell} \lVert {\bm{X}}^{(\ell) \top} {\bm{U}}^{(\ell)}_1 \rVert_{\mathrm{F}}^2$. Right: $(1 - \frac{1}{r_\ell} \lVert {\bm{X}}^{(\ell) \top} {\bm{U}}^{(\ell)}_1 \rVert_{\mathrm{F}}^2) \times \sqrt{\sigma_N}$. Experimental setting: $d = 3$, $(\frac{n_1}{N}, \frac{n_2}{N}, \frac{n_3}{N}) = (\frac{1}{6}, \frac{2}{6}, \frac{3}{6})$, $N = n_1 + n_2 + n_3$ and $(r_1, r_2, r_3) = (3, 4, 5)$.
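
The alignment quantities plotted in Figure 4 can be reproduced in spirit with a short NumPy sketch of truncated MLSVD followed by one HOOI sweep. This is a hedged illustration under stated assumptions, not the paper's Algorithm 1: the sweep updates the modes sequentially and reuses already-updated factors, $\sigma_N$ is assumed to equal $\sqrt{n_1 n_2 n_3} / N$ (inferred from the normalization in Figure 2), the core tensor is drawn Gaussian, and all function names are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding: an n_mode x (product of other dims) matrix."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Contract `tensor` along `mode` with `matrix` of shape (new_dim, old_dim)."""
    moved = np.moveaxis(tensor, mode, -1)
    return np.moveaxis(moved @ matrix.T, -1, mode)

def top_left_singular_vectors(matrix, r):
    U, _, _ = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :r]

def truncated_mlsvd(tensor, ranks):
    """Leading r_l left singular vectors of each unfolding."""
    return [top_left_singular_vectors(unfold(tensor, m), r) for m, r in enumerate(ranks)]

def hooi_step(tensor, factors, ranks):
    """One sweep: project on all other factors, then refresh the current one."""
    new = list(factors)
    for mode, r in enumerate(ranks):
        proj = tensor
        for other, U in enumerate(new):
            if other != mode:
                proj = mode_product(proj, U.T, other)
        new[mode] = top_left_singular_vectors(unfold(proj, mode), r)
    return new

def core_norm(tensor, factors):
    """Frobenius norm of the tensor projected on all factors (the HOOI objective)."""
    core = tensor
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)
    return np.linalg.norm(core)

def alignment(X, U):
    """(1 / r) * ||X^T U||_F^2, between 0 and 1 for orthonormal X and U."""
    return float(np.linalg.norm(X.T @ U) ** 2 / X.shape[1])

rng = np.random.default_rng(0)
dims, ranks = (30, 60, 90), (3, 4, 5)   # ratios (1/6, 2/6, 3/6) as in Figure 4
N = sum(dims)
sigma = np.sqrt(np.prod(dims)) / N      # assumed definition of sigma_N

# Random Tucker signal: Gaussian core times factors with orthonormal columns.
true_factors = [np.linalg.qr(rng.standard_normal((n, r)))[0] for n, r in zip(dims, ranks)]
signal = rng.standard_normal(ranks)
for mode, X in enumerate(true_factors):
    signal = mode_product(signal, X, mode)
signal *= np.sqrt(10 * sigma) / np.linalg.norm(signal)  # ||P||_F^2 / sigma_N = 10

observed = signal + rng.standard_normal(dims) / np.sqrt(N)
U0 = truncated_mlsvd(observed, ranks)   # initialization
U1 = hooi_step(observed, U0, ranks)     # after the first sweep

align_init = [alignment(X, U) for X, U in zip(true_factors, U0)]
align_hooi = [alignment(X, U) for X, U in zip(true_factors, U1)]
print("alignments at initialization:", align_init)
print("alignments after one sweep:  ", align_hooi)
```

Each factor update maximizes the projected norm with the other factors held fixed, so `core_norm` cannot decrease over a sweep; the dimensions here are far smaller than those in the figure, so the alignments should only be read qualitatively.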

Theorems & Definitions (18)

Remark 1: Uniqueness of the MLSVD up to isometries
Definition 2: Stieltjes transform
Definition 3: Deterministic equivalent
Lemma 4: Stein's lemma (Stein, 1981)
Corollary 5: Limiting spectral distribution
Remark 6: From Marčenko-Pastur to Wigner
Remark 7: Confinement of the spectrum
Theorem 8: Spike behavior
Remark 9: Spiked Wigner model
Definition 10: Principal angles
...and 8 more
