Efficient Algorithms for Verifying Kruskal Rank in Sparse Linear Regression and Related Applications

Fengqin Zhou

TL;DR

This work addresses the problem of verifying the Kruskal rank of matrices that arise in sparse linear regression, tensor decomposition, and latent-variable models. It introduces a unified framework that combines randomized hashing with dynamic programming to achieve high-probability correctness over binary fields, general finite fields, and integer matrices, with runtimes close to established lower bounds. There are three key contributions: (i) a collision-based hashing algorithm for binary fields with runtime $O(dk \cdot n^{\lceil k/2 \rceil})$; (ii) an extension to general finite fields $GF(q)$ with runtime $O(dk \cdot (n(q-1))^{\lceil k/2 \rceil})$; and (iii) an integer-matrix approach that uses Cramer’s rule and the Leibniz formula to obtain runtime $O(dk \cdot (nM)^{\lceil k/2 \rceil})$, together with a deterministic dimensionality-reduction method. These methods can be used to certify identifiability conditions in tensor decompositions and to diagnose noise-transition matrices in deep learning, with near-optimal performance guarantees that are established analytically rather than through empirical experiments. The framework thus contributes to both the theory and the practical verification of linear-dependence structures across these settings.

Abstract

We present novel algorithmic techniques to efficiently verify the Kruskal rank of matrices that arise in sparse linear regression, tensor decomposition, and latent variable models. Our unified framework combines randomized hashing techniques with dynamic programming strategies, and is applicable in various settings, including binary fields, general finite fields, and integer matrices. In particular, our algorithms achieve a runtime of $\mathcal{O}\left(dk \cdot \left(nM\right)^{\lceil k / 2 \rceil}\right)$ while ensuring high-probability correctness. Our contributions include: (i) a unified framework for verifying Kruskal rank across different algebraic settings; (ii) rigorous runtime and high-probability guarantees that nearly match known lower bounds; and (iii) practical implications for identifiability in tensor decompositions and deep learning, particularly for the estimation of noise transition matrices.
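
For orientation, the block below writes out the standard textbook definition of Kruskal rank and the "split in half" identity that a collision-based check relies on. It is not quoted from the paper: the symbols $\operatorname{krank}$, $\mathbf{x}_1$, $\mathbf{x}_2$, and $h$ are notation assumed here for illustration, and the final count is only an elementary bound on the number of half-weight candidates over $GF(q)$, which is consistent in scale with the runtimes stated above.

```latex
% Standard definition (assumed notation, A has n columns):
\operatorname{krank}(\mathbf{A})
  = \max\bigl\{\, k : \text{every set of } k \text{ columns of } \mathbf{A}
      \text{ is linearly independent} \,\bigr\}

% Equivalently, krank(A) >= k iff no nonzero x with ||x||_0 <= k has A x = 0.
% Write such an x as x = x_1 - x_2 with disjoint supports,
% ||x_1||_0 <= \lceil k/2 \rceil and ||x_2||_0 <= \lfloor k/2 \rfloor. Then
\mathbf{A}\mathbf{x} = \mathbf{0}
  \iff \mathbf{A}\mathbf{x}_1 = \mathbf{A}\mathbf{x}_2 .

% Number of candidates of weight at most h = \lceil k/2 \rceil over GF(q),
% valid whenever n(q-1) >= 1:
\sum_{j=0}^{h} \binom{n}{j} (q-1)^{j} \;\le\; (h+1)\,\bigl(n(q-1)\bigr)^{h}
```

Over the binary field, subtraction and addition coincide, so a sparse dependency corresponds to two distinct half-weight vectors with the same image under $\mathbf{A}$.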

Paper Structure

This paper contains 13 sections, 18 theorems, 9 equations, 1 algorithm.

Key Result

Lemma 4.1

For every $\mathbf{x}\in\{0,1\}^n$ with $\|\mathbf{x}\|_0\le\frac{k}{2}$, there exists some bucket in $\mathcal{H}$ that contains $\mathbf{A}\mathbf{x}$.
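
To make the bucket idea concrete, here is a minimal Python sketch of a collision-based check over the binary field. It is an illustration under stated assumptions, not the paper's algorithm: the function names `find_sparse_dependency_gf2` and `kruskal_rank_at_least` are hypothetical, columns are encoded as integer bitmasks, a plain dictionary stands in for the bucket family $\mathcal{H}$, and the sketch is deterministic rather than randomized. It enumerates every support of size at most $\lceil k/2 \rceil$, stores the image $\mathbf{A}\mathbf{x}$ in a bucket, and reports a dependency when two distinct supports share an image and their symmetric difference has at most $k$ elements.

```python
from itertools import combinations


def find_sparse_dependency_gf2(columns, k):
    """Search for 1..k columns over GF(2) that XOR to zero.

    columns: list of ints; each int is a bitmask encoding one column of A.
    Returns a sorted tuple of column indices, or None if no such set exists.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    n = len(columns)
    half = (k + 1) // 2  # ceil(k/2)
    buckets = {}  # image A x (as int) -> list of supports with that image

    for size in range(half + 1):
        for support in combinations(range(n), size):
            # Over GF(2), A x is the XOR of the selected columns.
            image = 0
            for j in support:
                image ^= columns[j]

            current = frozenset(support)
            for other in buckets.get(image, []):
                # Shared indices cancel, so the columns indexed by the
                # symmetric difference XOR to zero.
                dependency = current ^ other
                if 0 < len(dependency) <= k:
                    return tuple(sorted(dependency))
            buckets.setdefault(image, []).append(current)
    return None


def kruskal_rank_at_least(columns, k):
    """True iff every set of k columns is linearly independent over GF(2)."""
    return find_sparse_dependency_gf2(columns, k) is None


if __name__ == "__main__":
    # Columns e1, e2, e3 and (1,1,1): any 3 are independent, all 4 are not.
    A = [0b001, 0b010, 0b100, 0b111]
    print(kruskal_rank_at_least(A, 3))
    print(find_sparse_dependency_gf2(A, 4))
```

The explicit size check on the symmetric difference matters for odd $k$, where two supports of size $\lceil k/2 \rceil$ can combine into a dependency of size $k+1$. Scanning every entry of a bucket keeps the sketch simple but does not by itself guarantee the runtime bounds stated in the paper.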

Theorems & Definitions (32)

  • Lemma 4.1
  • proof
  • Lemma 4.2
  • proof
  • Lemma 4.3
  • proof
  • Lemma 4.4
  • proof
  • Lemma 4.5
  • Lemma 4.6
  • ...and 22 more