Tensor cumulants for statistical inference on invariant distributions

Dmitriy Kunisky; Cristopher Moore; Alexander S. Wein

TL;DR

The paper develops a tensor generalization of finite free cumulants that provides an explicit, near-orthogonal basis for invariant polynomials over tensors, unifying and strengthening low-degree hardness results for invariant problems such as Tensor PCA. Using tensor networks, open-graph moments, and Weingarten calculus, it derives exact additivity properties and a graph-moment expansion that yields sharp detection and reconstruction thresholds in the spiked tensor model, and a sharp computational threshold for distinguishing Wigner from Wishart tensor ensembles. The main technical advance is the tensorial finite free cumulant framework, which ties invariant statistics to additive free convolution and provides a tractable basis for analyzing low-degree algorithms. These tools give a combinatorial explanation of the statistical-computational gap and of the continuum of subexponential-time algorithms that succeed below the hardness transition. Together, the results show how invariance and cumulant structure govern the limits of polynomial-time inference on high-dimensional symmetric tensors with orthogonally invariant distributions.

Abstract

Many problems in high-dimensional statistics appear to have a statistical-computational gap: a range of values of the signal-to-noise ratio where inference is information-theoretically possible, but (conjecturally) computationally intractable. A canonical such problem is Tensor PCA, where we observe a tensor $Y$ consisting of a rank-one signal plus Gaussian noise. Multiple lines of work suggest that Tensor PCA becomes computationally hard at a critical value of the signal's magnitude. In particular, below this transition, no low-degree polynomial algorithm can detect the signal with high probability; conversely, various spectral algorithms are known to succeed above this transition. We unify and extend this work by considering tensor networks, orthogonally invariant polynomials where multiple copies of $Y$ are "contracted" to produce scalars, vectors, matrices, or other tensors. We define a new set of objects, tensor cumulants, which provide an explicit, near-orthogonal basis for invariant polynomials of a given degree. This basis lets us unify and strengthen previous results on low-degree hardness, giving a combinatorial explanation of the hardness transition and of a continuum of subexponential-time algorithms that work below it, and proving tight lower bounds against low-degree polynomials for recovering rather than just detecting the signal. It also lets us analyze a new problem of distinguishing between different tensor ensembles, such as Wigner and Wishart tensors, establishing a sharp computational threshold and giving evidence of a new statistical-computational gap in the Central Limit Theorem for random tensors. Finally, we believe these cumulants are valuable mathematical objects in their own right: they generalize the free cumulants of free probability theory from matrices to tensors, and share many of their properties, including additivity under additive free convolution.

Paper Structure (36 sections, 71 theorems, 180 equations, 12 figures)

Introduction
Main Results
Tensor PCA
Computational central limit theorems and Wigner vs. Wishart
Main Proof Technique: Tensorial Finite Free Cumulants
Related Work
Eigenvalues of tensors
Random tensor theory
Tensor PCA
Algorithms using tensor networks and distinct indices
Cumulants in low-degree lower bounds
Notation
Preliminaries
Closed Tensor Networks as Invariant Polynomials
Open Tensor Networks as Equivariant Polynomials
...and 21 more sections

Key Result

Theorem 1.6

Let $D = D(n) \in \mathbb{N}$ have $D \leq \sqrt{n / 2p^2}$. There are constants $a_p, b_p > 0$ such that:

Figures (12)

Figure 1: The graph moment $m_G(T)$ where $T$ is a symmetric 3-ary tensor and $G=K_4$. Summing over all six indices, one on each edge, contracts the graph and yields the scalar in \ref{eq:k4}.
Figure 2: On the left, a tensor network defining a 4-index tensor from two copies of a 3-ary tensor; we sum over all $n$ values of the internal index $z$. It can be viewed, among other things, as an $n^2$-dimensional matrix $M_{(i,j),(k,\ell)}$. This matrix is used in a spectral algorithm for tensor PCA in pmlr-v40-Hopkins15. On the right, a partial trace where we also sum over $i$, giving an $n$-dimensional matrix $M_{j\ell}$ used by HSSS for another spectral algorithm.
Figure 3: Three examples of mixed moments of tensors. Here $U$, $V$, and $W$ have arity 1, 2, and 3 respectively. From left to right, these yield the bilinear form $\langle U,VU \rangle$, the trilinear form $\langle W, U^{\otimes 3}\rangle$, and the quantity in \ref{eq:mixed-example}.
Figure 4: Three open multigraphs whose partial contractions $m_G(T)$ yield vectors or matrices, i.e., 1-ary or 2-ary tensors. The first is the vector $m_G(T)_i = \sum_j T_{ijj}$. The third is the matrix $m_G(T)_{ii'}$ given by \ref{eq:open-k4}.
Figure 5: On the left, conjugating a 3-ary tensor $T$ by an orthogonal matrix $Q$. Each index of $T$ undergoes the same orthogonal basis change. On the right, for a 2-ary tensor, i.e., a matrix, this coincides with the usual notion of conjugation $Q^\top T Q$. Arrows indicate that $T$'s indices are contracted with the left index $i$ of $Q_{ij}$. Reversing an arrow converts $Q$ to its transpose $Q^\top$.
...and 7 more figures
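
The contractions described in the captions of Figures 1, 2, 4, and 5 can be written as index sums. The following is a minimal NumPy sketch, not code from the paper. The helper names (`symmetrize`, `conjugate`, `k4_moment`, `trace_vector`), the random test tensor, and the dimension `n = 5` are illustrative assumptions. It checks numerically that the $K_4$ graph moment is unchanged when every index of $T$ undergoes the same orthogonal basis change, and that the open moment $\sum_j T_{ijj}$ transforms as a vector.

```python
import itertools
import numpy as np

def symmetrize(A):
    """Average A over all permutations of its axes to obtain a symmetric tensor."""
    perms = list(itertools.permutations(range(A.ndim)))
    return sum(np.transpose(A, p) for p in perms) / len(perms)

def conjugate(T, Q):
    """Apply the same basis change Q to each index of a 3-ary tensor
    (for a matrix this would be Q^T T Q, as in the Figure 5 caption)."""
    return np.einsum("ijk,ia,jb,kc->abc", T, Q, Q, Q)

def k4_moment(T):
    """Closed graph moment for G = K_4: four copies of T, one summed index per edge."""
    return float(np.einsum("ijk,ilm,jln,kmn->", T, T, T, T))

def trace_vector(T):
    """Open graph moment with one free index: v_i = sum_j T_{ijj}."""
    return np.einsum("ijj->i", T)

# Hypothetical test data: a random symmetric tensor and a random orthogonal matrix.
rng = np.random.default_rng(0)
n = 5
T = symmetrize(rng.standard_normal((n, n, n)))
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
T_rot = conjugate(T, Q)

# Invariance of the scalar moment and equivariance of the vector moment.
scalar_gap = abs(k4_moment(T) - k4_moment(T_rot))
vector_gap = np.linalg.norm(trace_vector(T_rot) - Q.T @ trace_vector(T))

# Figure 2 contractions: the n^2 x n^2 matrix M_{(i,j),(k,l)} = sum_z T_{ijz} T_{klz},
# and the partial trace M_{jl} = sum_{i,z} T_{ijz} T_{ilz}.
M = np.einsum("ijz,klz->ijkl", T, T).reshape(n * n, n * n)
M_partial = np.einsum("ijz,ilz->jl", T, T)
```

Both `scalar_gap` and `vector_gap` should be at the level of floating-point round-off. Because $T$ is symmetric, any assignment of the six summed indices to the edges of $K_4$ gives the same scalar, so the particular labeling in `k4_moment` is only one convenient choice.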

Theorems & Definitions (168)

Definition 1.1: Change of basis
Definition 1.2: Invariance
Definition 1.3: Graph moments
Example 1.4
Definition 1.5
Theorem 1.6: Tensor PCA detection; informal; Theorem 3.3 of KWB-2022-LowDegreeNotes
Theorem 1.7: Lower bound for tensor PCA reconstruction; informal
Definition 1.8
Theorem 1.9: Lower bound for Wigner vs. Wishart detection; informal
Theorem 1.10: Upper bound for Wigner vs. Wishart detection; informal
...and 158 more
