Matrix-Free Two-to-Infinity and One-to-Two Norms Estimation

Askar Tsyganov, Evgeny Frolov, Sergey Samsonov, Maxim Rakhuba

TL;DR

The paper targets matrix-free estimation of the induced two-to-infinity and one-to-two norms using only matrix-vector products. It introduces TwINEst, which applies Hutchinson's diagonal estimator to $AA^\top$, whose diagonal holds the squared $\ell_2$ row norms of $A$, and selects the maximal entry; its oracle complexity scales with the gap $\Delta$ between the largest row norm and the next largest. An enhanced variant, TwINEst++, combines a low-rank approximation of $AA^\top$ with a stochastic diagonal estimate to reduce variance and improve robustness, achieving tighter complexity bounds, especially when $\Delta$ is small. The algorithms are validated on synthetic data, neural-network Jacobians, and recommender-system problems, where they improve estimation accuracy and bring practical benefits for Jacobian regularization and adversarial robustness. The work highlights the utility of matrix-free norm estimation in large-scale ML settings.
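The diagonal-estimation idea summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' reference implementation: the function name `twinest_sketch`, its parameters, the Rademacher probes, and the final exact evaluation of the selected row norm are all assumptions made for illustration. The only access to $A$ is through the two callables `matvec` ($v \mapsto Av$) and `rmatvec` ($u \mapsto A^\top u$).

```python
import numpy as np

def twinest_sketch(matvec, rmatvec, n_rows, m, seed=None):
    """Hypothetical sketch: estimate ||A||_{2->inf} = max_i ||A[i, :]||_2.

    matvec(v) returns A @ v, rmatvec(u) returns A.T @ u.
    """
    rng = np.random.default_rng(seed)
    diag_est = np.zeros(n_rows)
    for _ in range(m):
        x = rng.choice([-1.0, 1.0], size=n_rows)  # Rademacher probe
        y = matvec(rmatvec(x))                    # (A A^T) x, two products
        diag_est += x * y                         # Hutchinson diagonal update
    diag_est /= m
    i_star = int(np.argmax(diag_est))             # candidate row of largest norm
    e = np.zeros(n_rows)
    e[i_star] = 1.0
    # Assumption of this sketch: one extra product A^T e_i gives the
    # selected row exactly, so the returned value is a true row norm.
    return float(np.linalg.norm(rmatvec(e))), i_star

# Small demo on an explicit matrix with one dominant row.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 30))
A[7] *= 10.0
est, idx = twinest_sketch(lambda v: A @ v, lambda u: A.T @ u,
                          n_rows=A.shape[0], m=20, seed=1)

# One-to-two norm of B (largest column norm) equals the
# two-to-infinity norm of B.T, so the two callables swap roles.
B = A.T
est12, col = twinest_sketch(lambda v: B.T @ v, lambda u: B @ u,
                            n_rows=B.shape[1], m=20, seed=1)
```

Because the returned value is the norm of an actual row, the sketch never overestimates the true norm; it is exact whenever the estimated diagonal has its maximum at the right index, which is easier when the gap between the top two row norms is large.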

Abstract

In this paper, we propose new randomized algorithms for estimating the two-to-infinity and one-to-two norms in a matrix-free setting, using only matrix-vector multiplications. Our methods are based on appropriate modifications of Hutchinson's diagonal estimator and its Hutch++ version. We provide oracle complexity bounds for both modifications. We further illustrate the practical utility of our algorithms for Jacobian-based regularization in deep neural network training on image classification tasks. We also demonstrate that our methodology can be applied to mitigate the effect of adversarial attacks in the domain of recommender systems.
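To make the Hutch++-style modification more concrete, the following sketch splits $B = AA^\top$ into a low-rank part, whose diagonal is computed exactly, and a residual, whose diagonal is estimated stochastically. This is a generic Hutch++-style construction under stated assumptions, not the paper's TwINEst++ algorithm: the function name, the Gaussian sketch of size `k`, the Rademacher probes, and the final exact row-norm evaluation are all hypothetical choices.

```python
import numpy as np

def hutchpp_style_sketch(matvec, rmatvec, n_rows, k, m, seed=None):
    """Hypothetical sketch: low-rank part plus stochastic residual diagonal.

    matvec(v) returns A @ v, rmatvec(u) returns A.T @ u.
    """
    rng = np.random.default_rng(seed)

    def gram(V):
        # Apply B = A A^T to each column of V (two products per column).
        cols = [matvec(rmatvec(V[:, j])) for j in range(V.shape[1])]
        return np.column_stack(cols)

    # Orthonormal basis Q for an approximate dominant range of B.
    S = rng.standard_normal((n_rows, k))
    Q, _ = np.linalg.qr(gram(S))

    # Exact diagonal of Q Q^T B, using the symmetry of B:
    # (Q Q^T B)_ii = sum_j Q_ij (B Q)_ij.
    diag_lowrank = np.sum(Q * gram(Q), axis=1)

    # Hutchinson estimate of the diagonal of the residual (I - Q Q^T) B.
    X = rng.choice([-1.0, 1.0], size=(n_rows, m))
    R = gram(X)
    R -= Q @ (Q.T @ R)
    diag_resid = np.mean(X * R, axis=1)

    i_star = int(np.argmax(diag_lowrank + diag_resid))
    e = np.zeros(n_rows)
    e[i_star] = 1.0
    # Assumption of this sketch: evaluate the selected row norm exactly.
    return float(np.linalg.norm(rmatvec(e))), i_star

# Small demo on an explicit matrix with one dominant row.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 25))
A[3] *= 8.0
est_pp, idx_pp = hutchpp_style_sketch(lambda v: A @ v, lambda u: A.T @ u,
                                      n_rows=A.shape[0], k=5, m=20, seed=1)
```

In this sketch the total cost is $4k + 2m + 1$ products with $A$ or $A^\top$, so the split between `k` and `m` is a budget trade-off: a larger `k` removes more of the dominant spectrum from the residual, which lowers the variance of the stochastic part.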

Paper Structure

This paper contains 28 sections, 6 theorems, 47 equations, 8 figures, 4 tables, 3 algorithms.

Key Result

Theorem 1

Let $A \in \mathbb{R}^{d \times d}$, $m \in \mathbb{N}$, $\delta \in (0, 1]$. Then with probability at least $1 - \delta$: where $c$ is an absolute constant.
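For orientation, the bound in the cited reference (dharangutte2023tight) is usually stated in the following form for Hutchinson's diagonal estimator with Rademacher probes. This is a hedged restatement, not this paper's exact display: the symbols $D^m$, $x_i$, and $\bar{A}$ (the matrix $A$ with its diagonal set to zero) are notation assumed here.

```latex
D^m = \frac{1}{m} \sum_{i=1}^{m} x_i \odot (A x_i),
\qquad
\bigl\| D^m - \operatorname{diag}(A) \bigr\|_2
\le c \, \sqrt{\frac{\log(2/\delta)}{m}} \; \|\bar{A}\|_F .
```

The right-hand side does not depend on the dimension $d$, and it decays as $m^{-1/2}$ in the number of probe vectors.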

Figures (8)

  • Figure 1: Comparison of methods for estimating the two-to-infinity norm on random square matrices. Shown is the relative error versus the number of matrix-vector multiplications, averaged over 500 trials.
  • Figure 2: Comparison of methods for estimating the two-to-infinity norm of the Jacobian matrix of WideResNet-16-10 trained on CIFAR-100. The plot shows the relative error versus the number of matrix-vector multiplications, averaged over 500 trials.
  • Figure 3: Comparison of different regularization methods for adversarial robustness of UltraGCN [mao2021ultragcn]. Shown is the NDCG@10 metric (higher is better) versus the magnitude of the attack. The metric is averaged over 5 trials.
  • Figure 4: Singular values of synthetic and real world matrices.
  • Figure 5: Comparison of different strategies for choosing $r$ in the TwINEst++ algorithm. The plot shows the relative error versus the number of matrix-vector multiplications, averaged over 500 trials.
  • ...and 3 more figures

Theorems & Definitions (17)

  • Example 1: Divergence of the Adaptive Power Method
  • Definition 1
  • Theorem 1: Theorem 1 in dharangutte2023tight
  • Theorem 2: TwINEst Oracle Complexity
  • Remark 3: TwINEst Deviation Bound
  • proof
  • Theorem 4: TwINEst++ Oracle Complexity
  • proof
  • Lemma 5
  • proof
  • ...and 7 more