
Asymptotic Optimism for Tensor Regression Models with Applications to Neural Network Compression

Haoming Shi, Eric C. Chi, Hengrui Luo

Abstract

We study rank selection for low-rank tensor regression under a random covariate design. Under a Gaussian random-design model and mild conditions, we derive population expressions for the expected training-testing discrepancy (optimism) for both CP and Tucker decompositions. We further show that the optimism is minimized at the true tensor rank for both CP and Tucker regression. This yields a prediction-oriented rank-selection rule that aligns with cross-validation and extends naturally to tensor-model averaging. We also discuss conditions under which under- or over-ranked models may appear preferable, thereby clarifying the scope of the method. Finally, we demonstrate its practical utility on a real-world image regression task and extend it to tensor-based compression of neural networks, highlighting its potential for model selection in deep learning.
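The abstract's central idea, that the train-test gap (optimism) is minimized at the true rank, can be illustrated with a minimal Monte Carlo sketch. The setup below is purely illustrative and not the paper's method: it uses a rank-3 matrix coefficient (an order-2 tensor) and SVD truncation of a least-squares estimate as a simple stand-in for CP/Tucker fitting; all dimensions, noise levels, and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n_train, n_test, true_rank, sigma = 20, 500, 2000, 3, 1.0

# Illustrative model (not the paper's): y = <X, B> + noise, where the
# coefficient B is a rank-3 p x p matrix.
U = rng.standard_normal((p, true_rank))
V = rng.standard_normal((p, true_rank))
B = U @ V.T / np.sqrt(p)

def make_data(n):
    X = rng.standard_normal((n, p * p))
    y = X @ B.ravel() + sigma * rng.standard_normal(n)
    return X, y

X_tr, y_tr = make_data(n_train)
X_te, y_te = make_data(n_test)

# Unconstrained least squares, then truncate the SVD of the estimate to
# each candidate rank -- a reduced-rank surrogate for low-rank fitting.
b_ols, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
Uo, s, Vt = np.linalg.svd(b_ols.reshape(p, p))

train_err, test_err = [], []
for r in range(1, 7):
    B_r = (Uo[:, :r] * s[:r]) @ Vt[:r]
    train_err.append(np.mean((y_tr - X_tr @ B_r.ravel()) ** 2))
    test_err.append(np.mean((y_te - X_te @ B_r.ravel()) ** 2))

# Empirical optimism: held-out error minus training error at each rank.
optimism = np.array(test_err) - np.array(train_err)
print("rank with smallest test error:", 1 + int(np.argmin(test_err)))
```

Training error keeps shrinking as the rank grows, while held-out error bottoms out near the true rank, so the gap between the two curves is informative for rank selection, which is the intuition the paper formalizes for genuine CP and Tucker models.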

Paper Structure

This paper contains 38 sections, 18 theorems, 218 equations, 11 figures, 4 tables.

Key Result

Lemma 3.1

(Linear Independence of Vectorized CP Components) Let $\boldsymbol{\mathscr{B}}\in\mathbb{R}^{I_{1} \times \cdots \times I_{M}}$ be a tensor with a rank-$R$ CP decomposition (eq:CP_decomposition) and let $\{{\bm{\mathbf{v}}}_{r}\}_{r=1}^{R}$ denote its vectorized rank-1 components, with ${\bm{\mathbf{v}}}_{r}$ …
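The lemma's statement is truncated in this extract, but its claim can be checked numerically for a generic third-order example. The sketch below is an assumption-laden illustration, not the paper's proof: it draws random CP factor matrices, forms the vectorized rank-1 components (with column-major vectorization, each is a Kronecker product of the factor columns in reverse mode order), and verifies that they are linearly independent; the dimensions and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
I1, I2, I3, R = 4, 5, 6, 3  # illustrative mode sizes and CP rank

# Generic (random Gaussian) CP factor matrices, one per mode, R columns each.
A1, A2, A3 = (rng.standard_normal((d, R)) for d in (I1, I2, I3))

# v_r = vec(a1_r o a2_r o a3_r). Under column-major vectorization this
# equals the Kronecker product a3_r (x) a2_r (x) a1_r.
Vmat = np.column_stack(
    [np.kron(A3[:, r], np.kron(A2[:, r], A1[:, r])) for r in range(R)]
)

# For generic factors the R vectorized components span an R-dimensional
# subspace of R^{I1*I2*I3}, i.e. they are linearly independent.
print("rank of component matrix:", np.linalg.matrix_rank(Vmat))
```

Linear independence of these vectorized components is what lets the downstream theory treat a rank-$R$ CP model as an $R$-dimensional effective parameterization when analyzing optimism.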

Figures (11)

  • Figure 1: Average optimism of the tensor KRR model over $10000$ Monte Carlo replicates. The left panel varies the target CP rank and noise level, with the training sample size held constant at $n_{\text{train}} = 200$. The right panel varies the target CP rank and the training sample size, with the noise level held constant at $5\%$ of the signal standard deviation. In all cases, the regularization parameter is $\lambda = 1$. Results are shown for the oracle case, where the CP kernel is constructed from the true tensor coefficient $\boldsymbol{\mathscr{B}}$, which has a default rank of $3$.
  • Figure 1: Selected Ranks
  • Figure 2: Average optimism for low-rank CP and Tucker regression with varying ranks and sample sizes. The left panel shows results for CP regression over $10000$ MC replicates; the noise level is fixed at $5\%$ of the signal standard deviation, and the true CP rank is $R = 3$. The right panel shows results for Tucker regression (using TensorGP from yu2018tensor); the noise level is fixed at $1\%$ of the signal standard deviation to keep the noise magnitude comparable to the CP case, and the true Tucker rank is $R = (3,3,3)$. Here we reduce the number of MC replicates to $100$ because of the high computational cost of Tucker fitting under our compute budget. In both plots, the model rank varies along the x-axis, while colors correspond to different training sample sizes.
  • Figure 2: Selected Ranks
  • Figure 3: Selected Ranks
  • ...and 6 more figures

Theorems & Definitions (27)

  • Lemma 3.1
  • Lemma 3.2
  • Theorem 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Proposition 3.1
  • Remark 3.1
  • Theorem 3.4
  • Remark 4.1
  • Theorem 4.1
  • ...and 17 more