Statistical and Computational Efficiency for Smooth Tensor Estimation with Unknown Permutations

Chanwoo Lee, Miaoyan Wang

TL;DR

This work develops a permuted smooth tensor framework to denoise high-order tensors with an unknown latent permutation, linking Hölder-smooth function representation to block-wise polynomial approximations on a fixed grid. It reveals a phase transition in recovery governed by the smoothness parameter and tensor order, establishing minimax rates that separate nonparametric and permutation-driven complexity, and showing a sufficiency threshold for polynomial degree. The authors present a statistically optimal yet computationally intensive least-squares estimator, and a scalable Borda-count algorithm that achieves the minimax rate under a monotonicity assumption, with theoretical guarantees and empirical validation on synthetic data and Chicago crime data. The results illuminate fundamental statistical-computational gaps in permuted tensor estimation and provide practical tools, including a public R package, for structured tensor denoising in diverse applications.
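To make the Borda-count idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the score of an index is the mean of all tensor entries in its slice, and that the underlying function is monotone in each argument, so sorting the scores recovers the latent ordering. The function name `borda_count_order`, the toy signal, and the noise level are all illustrative assumptions.

```python
import numpy as np

def borda_count_order(Y):
    """Sort mode-0 indices by slice mean (an assumed Borda-style score)."""
    d = Y.shape[0]
    scores = Y.reshape(d, -1).mean(axis=1)  # average over all other modes
    return np.argsort(scores, kind="stable")

# Toy order-3 tensor: a smooth signal that is monotone in every index.
rng = np.random.default_rng(0)
d = 30
x = np.arange(1, d + 1) / d
signal = (x[:, None, None] + x[None, :, None] + x[None, None, :]) / 3.0

# Hide the ordering behind an unknown permutation and add small noise.
perm = rng.permutation(d)
Y = signal[np.ix_(perm, perm, perm)] + 0.02 * rng.standard_normal((d, d, d))

# Sorting by score should undo the permutation up to estimation error.
order = borda_count_order(Y)
recovered = perm[order]  # latent positions, listed in estimated order
print(recovered[:5])
```

In this toy setting the gap between adjacent slice means is much larger than the noise in each mean, so the estimated order matches the latent one. With rougher signals or heavier noise, only an approximate ordering should be expected.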

Abstract

We consider the problem of structured tensor denoising in the presence of unknown permutations. Such data problems arise commonly in recommendation systems, neuroimaging, community detection, and multiway comparison applications. Here, we develop a general family of smooth tensor models up to arbitrary index permutations; the model incorporates the popular tensor block models and Lipschitz hypergraphon models as special cases. We show that a constrained least-squares estimator in the block-wise polynomial family achieves the minimax error bound. A phase transition phenomenon is revealed with respect to the smoothness threshold needed for optimal recovery. In particular, we find that a polynomial of degree up to $(m-2)(m+1)/2$ is sufficient for accurate recovery of order-$m$ tensors, whereas higher degrees exhibit no further benefit. This phenomenon reveals the intrinsic distinction between smooth tensor estimation problems with and without unknown permutations. Furthermore, we provide an efficient polynomial-time Borda count algorithm that provably achieves the optimal rate under monotonicity assumptions. The efficacy of our procedure is demonstrated through both simulations and Chicago crime data analysis.
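As a quick arithmetic check on the degree threshold $(m-2)(m+1)/2$ quoted in the abstract, the short sketch below evaluates it for a few tensor orders. The helper name `sufficient_degree` is an illustrative assumption, not something taken from the paper or its R package.

```python
def sufficient_degree(m: int) -> int:
    """Evaluate (m - 2)(m + 1) / 2, the degree threshold stated in the abstract."""
    if m < 2:
        raise ValueError("tensor order m must be at least 2")
    # (m - 2)(m + 1) = m(m - 1) - 2 is always even, so integer division is exact.
    return (m - 2) * (m + 1) // 2

for m in range(2, 6):
    print(f"order m = {m}: degree up to {sufficient_degree(m)}")
```

For matrices ($m=2$) the formula gives 0, that is, a block-wise constant fit. For order-3 tensors it gives 2, and for order-4 tensors it gives 5.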

Paper Structure

This paper contains 45 sections, 19 theorems, 135 equations, 10 figures, 6 tables.

Key Result

Proposition 1

Suppose $\Theta\in\mathcal{P}(\alpha,L)$. Then, for every block number $k\leq d$ and degree $\ell\in \mathbb{N}$, we have the approximation error

Figures (10)

  • Figure 1: (a): Illustration of order-$m$ $d$-dimensional permuted smooth tensor models with $m=2$. (b): Phase transition of the mean squared error (MSE) (on the $-\log_d$ scale) as a function of smoothness $\alpha$ and tensor order $m$. Bold dots correspond to the critical smoothness level above which higher smoothness exhibits no further benefits for tensor estimation.
  • Figure 3: MSE versus the number of blocks based on different polynomial approximations. Columns 1-3 consider Models 1, 3, and 5, respectively. Panel (a) is for continuous tensors, whereas (b) is for binary tensors.
  • Figure 4: MSE versus the tensor dimension based on different estimation methods. Columns 1-3 consider Models 1, 3, and 5 in Table \ref{tb:md}, respectively. Panel (a) is for continuous tensors, whereas (b) is for binary tensors.
  • Figure 5: Performance comparison among different methods. The observed data tensors, true signal tensors, and estimated signal tensors are plotted for Models 1, 3, and 5 in Table \ref{tb:md} with fixed dimension $d = 80$. Numbers in parentheses indicate the mean squared error.
  • Figure 6: Chicago crime maps. Panel (a) is the benchmark map based on homicides and shooting incidents in community areas in Chicago [Jeremy.2020]. Panel (b) shows the four clustered areas learned from 32 crime types using our method.
  • ...and 5 more figures

Theorems & Definitions (46)

  • Definition 1: $\alpha$-Hölder smooth
  • Example 1: Co-authorship networks
  • Example 2: Gaussian tensor block model
  • Proposition 1: Block-wise polynomial tensor approximation
  • Theorem 1: Least-squares estimation error
  • Remark 1: Comparison to non-parametric regression
  • Remark 2: Breaking previous limits on matrices/tensors
  • Theorem 2: Minimax lower bound
  • Remark 3: Phase transition
  • Remark 4: Extension to random designs
  • ...and 36 more