Tensor cumulants for statistical inference on invariant distributions

Dmitriy Kunisky, Cristopher Moore, Alexander S. Wein

TL;DR

The paper develops a tensor generalization of finite free cumulants that provides an explicit, near-orthogonal basis for invariant polynomials over tensors, unifying and strengthening low-degree hardness results for invariant problems such as Tensor PCA. Using tensor networks, open graph moments, and Weingarten calculus, it derives exact additivity properties and a graph-moment expansion that yield tight low-degree bounds for detection and reconstruction in the spiked tensor model, and a sharp computational threshold for distinguishing Wigner from Wishart tensor ensembles. The main technical advance is the tensorial finite free cumulant framework, which ties invariant statistics to additive free convolution and gives a tractable basis for analyzing low-degree algorithms. These tools give a combinatorial explanation of the statistical-computational gap and of the continuum of subexponential-time algorithms that succeed below the hardness transition. Together, the results show how invariance and cumulant structure govern the limits of efficient inference on high-dimensional symmetric tensors drawn from orthogonally invariant distributions.

Abstract

Many problems in high-dimensional statistics appear to have a statistical-computational gap: a range of values of the signal-to-noise ratio where inference is information-theoretically possible, but (conjecturally) computationally intractable. A canonical such problem is Tensor PCA, where we observe a tensor $Y$ consisting of a rank-one signal plus Gaussian noise. Multiple lines of work suggest that Tensor PCA becomes computationally hard at a critical value of the signal's magnitude. In particular, below this transition, no low-degree polynomial algorithm can detect the signal with high probability; conversely, various spectral algorithms are known to succeed above this transition. We unify and extend this work by considering tensor networks, orthogonally invariant polynomials where multiple copies of $Y$ are "contracted" to produce scalars, vectors, matrices, or other tensors. We define a new set of objects, tensor cumulants, which provide an explicit, near-orthogonal basis for invariant polynomials of a given degree. This basis lets us unify and strengthen previous results on low-degree hardness, giving a combinatorial explanation of the hardness transition and of a continuum of subexponential-time algorithms that work below it, and proving tight lower bounds against low-degree polynomials for recovering rather than just detecting the signal. It also lets us analyze a new problem of distinguishing between different tensor ensembles, such as Wigner and Wishart tensors, establishing a sharp computational threshold and giving evidence of a new statistical-computational gap in the Central Limit Theorem for random tensors. Finally, we believe these cumulants are valuable mathematical objects in their own right: they generalize the free cumulants of free probability theory from matrices to tensors, and share many of their properties, including additivity under additive free convolution.

Paper Structure (36 sections, 71 theorems, 180 equations, 12 figures)

This paper contains 36 sections, 71 theorems, 180 equations, 12 figures.

Key Result

Theorem 1.6

Let $D = D(n) \in \mathbb{N}$ have $D \leq \sqrt{n / 2p^2}$. There are constants $a_p, b_p > 0$ such that:

Figures (12)

  • Figure 1: The graph moment $m_G(T)$ where $T$ is a symmetric 3-ary tensor and $G=K_4$. Summing over all six indices, one on each edge, contracts the graph and yields the scalar \ref{eq:k4}.
  • Figure 2: On the left, a tensor network defining a 4-index tensor from two copies of a 3-ary tensor; we sum over all $n$ values of the internal index $z$. It can be viewed, among other things, as an $n^2$-dimensional matrix $M_{(i,j),(k,\ell)}$. This matrix is used in a spectral algorithm for tensor PCA in pmlr-v40-Hopkins15. On the right, a partial trace where we also sum over $i$, giving an $n$-dimensional matrix $M_{j\ell}$ used by HSSS for another spectral algorithm.
  • Figure 3: Three examples of mixed moments of tensors. Here $U$, $V$, and $W$ have arity 1, 2, and 3 respectively. From left to right, these yield the bilinear form $\langle U,VU \rangle$, the trilinear form $\langle W, U^{\otimes 3}\rangle$, and the quantity in \ref{eq:mixed-example}.
  • Figure 4: Three open multigraphs whose partial contractions $m_G(T)$ yield vectors or matrices, i.e., 1-ary or 2-ary tensors. The first is the vector $m_G(T)_i = \sum_j T_{ijj}$. The third is the matrix $m_G(T)_{ii'}$ given by \ref{eq:open-k4}.
  • Figure 5: On the left, conjugating a 3-ary tensor $T$ by an orthogonal matrix $Q$. Each index of $T$ undergoes the same orthogonal basis change. On the right, for a 2-ary tensor, i.e., a matrix, this coincides with the usual notion of conjugation $Q^\top T Q$. Arrows indicate that $T$'s indices are contracted with the left index $i$ of $Q_{ij}$. Reversing an arrow converts $Q$ to its transpose $Q^\top$.
  • ...and 7 more figures
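
The contractions described in the captions of Figures 1, 4, and 5 can be checked numerically. The following is a minimal sketch in Python with NumPy, not code from the paper. It assumes the natural labeling of $K_4$ in which each of the four vertices carries one copy of $T$ and each of the six edges carries one summed index, so that $m_{K_4}(T) = \sum_{a,\dots,f} T_{abc} T_{ade} T_{bdf} T_{cef}$. The function names (`random_symmetric_tensor`, `conjugate`, `k4_moment`, `self_loop_vector`) are hypothetical and chosen only for illustration, and the Gaussian test tensor is just a convenient symmetric input, not one of the paper's ensembles with any particular normalization.

```python
import numpy as np


def random_symmetric_tensor(n, rng):
    """Return an n x n x n Gaussian array averaged over all 6 index orders."""
    g = rng.standard_normal((n, n, n))
    perms = [(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]
    return sum(g.transpose(p) for p in perms) / 6.0


def conjugate(t, q):
    """Apply the same basis change to every index of t.

    Each index of t is contracted with the left index of q, so for a
    matrix the analogous operation would be q.T @ t @ q (Figure 5).
    """
    return np.einsum("ia,jb,kc,ijk->abc", q, q, q, t)


def k4_moment(t):
    """Scalar graph moment for G = K_4 (Figure 1).

    One copy of t per vertex, one summed index per edge (a..f).
    """
    return np.einsum("abc,ade,bdf,cef->", t, t, t, t)


def self_loop_vector(t):
    """Open graph moment m_G(T)_i = sum_j T_{ijj} (first graph of Figure 4)."""
    return np.einsum("ijj->i", t)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 5
    t = random_symmetric_tensor(n, rng)
    # QR of a Gaussian matrix gives an orthogonal q; it is not exactly
    # Haar-distributed without a sign correction, which does not matter here.
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    t_rot = conjugate(t, q)

    # A closed graph moment is a scalar and should be unchanged by conjugation.
    print(k4_moment(t), k4_moment(t_rot))
    # An open graph moment with one free index should transform as a vector.
    print(np.allclose(self_loop_vector(t_rot), q.T @ self_loop_vector(t)))
```

The scalar moment agrees before and after conjugation up to floating-point error, while the one-index moment picks up a single factor of $Q^\top$. This is the sense in which closed graphs give invariant polynomials and open graphs give equivariant vectors, matrices, or higher-arity tensors.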

Theorems & Definitions (168)

  • Definition 1.1: Change of basis
  • Definition 1.2: Invariance
  • Definition 1.3: Graph moments
  • Example 1.4
  • Definition 1.5
  • Theorem 1.6: Tensor PCA detection; informal; Theorem 3.3 of KWB-2022-LowDegreeNotes
  • Theorem 1.7: Lower bound for tensor PCA reconstruction; informal
  • Definition 1.8
  • Theorem 1.9: Lower bound for Wigner vs. Wishart detection; informal
  • Theorem 1.10: Upper bound for Wigner vs. Wishart detection; informal
  • ...and 158 more