Fixed-sparsity matrix approximation from matrix-vector products

Noah Amsel, Tyler Chen, Feyza Duman Keles, Diana Halikias, Cameron Musco, Christopher Musco

TL;DR

The matrix-vector product query complexity of fixed-sparsity matrix approximation is resolved up to constant factors, even for the well-studied case of diagonal approximation, for which no previous lower bounds were known.

Abstract

We study the problem of approximating a matrix $\mathbf{A}$ with a matrix that has a fixed sparsity pattern (e.g., diagonal, banded, etc.), when $\mathbf{A}$ is accessed only by matrix-vector products. We describe a simple randomized algorithm that returns an approximation with the given sparsity pattern with Frobenius-norm error at most $(1+\varepsilon)$ times the best possible error. When each row of the desired sparsity pattern has at most $s$ nonzero entries, this algorithm requires $O(s/\varepsilon)$ non-adaptive matrix-vector products with $\mathbf{A}$. We also prove a matching lower bound, showing that, for any sparsity pattern with $\Theta(s)$ nonzeros per row and column, any algorithm achieving a $(1+\varepsilon)$ approximation requires $\Omega(s/\varepsilon)$ matrix-vector products in the worst case. We thus resolve the matrix-vector product query complexity of the problem up to constant factors, even for the well-studied case of diagonal approximation, for which no previous lower bounds were known.
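To make the query model concrete, here is a minimal sketch of the diagonal case ($s = 1$). The abstract does not spell out the algorithm, so the choices here are assumptions: the queries are i.i.d. Gaussian vectors, and each diagonal entry is fit by a one-dimensional least-squares problem against the query responses. The names `estimate_diagonal`, `dense_matvec`, and `matvec` are hypothetical and used only for illustration.

```python
import random


def estimate_diagonal(matvec, n, m, seed=0):
    """Estimate diag(A) for an n x n matrix A from m non-adaptive matvecs.

    Assumption (not taken from the source text): queries are standard
    Gaussian vectors, and entry i is the least-squares fit of the i-th
    coordinates of the responses against the i-th coordinates of the queries.
    """
    rng = random.Random(seed)
    num = [0.0] * n  # accumulates sum_j g_j[i] * (A g_j)[i]
    den = [0.0] * n  # accumulates sum_j g_j[i] ** 2
    for _ in range(m):
        # The query does not depend on earlier responses (non-adaptive).
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = matvec(g)  # the only access to A
        for i in range(n):
            num[i] += g[i] * z[i]
            den[i] += g[i] * g[i]
    return [num[i] / den[i] for i in range(n)]


def dense_matvec(A):
    """Wrap a list-of-lists matrix as a matvec oracle (illustration only)."""
    return lambda x: [sum(a * xj for a, xj in zip(row, x)) for row in A]


# Usage: a 5 x 5 tridiag(-1, 4, -1) matrix, queried 50 times.
n = 5
A = [[4.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(n)]
     for i in range(n)]
print(estimate_diagonal(dense_matvec(A), n, m=50))
```

When the off-diagonal entries are zero the fit is exact for any $m \geq 1$; otherwise the off-diagonal mass acts as noise that shrinks as $m$ grows, which is the trade-off the $O(s/\varepsilon)$ bound quantifies.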

Paper Structure (26 sections, 17 theorems, 98 equations, 4 figures, 2 algorithms)

This paper contains 26 sections, 17 theorems, 98 equations, 4 figures, 2 algorithms.

Key Result

Theorem 1

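Consider any $\mathbf{A} \in\mathbb{R}^{n\times d}$ and any $\mathbf{S}\in\{0,1\}^{n\times d}$ with at most $s$ nonzero entries per row. Then, for any $m\geq s+2$, using $m$ randomized matrix-vector queries, \ref{alg:main} returns a matrix $\widetilde{\mathbf{A}}$, equal to $\mathbf{S}\circ\mathbf{A}$ in expectation, satisfying $$\mathbb{E}\left[\|\mathbf{A}-\widetilde{\mathbf{A}}\|_\mathsf{F}^2\right] \leq \left(1+\frac{s}{m-s-1}\right)\|\mathbf{A}-\mathbf{S}\circ\mathbf{A}\|_\mathsf{F}^2.$$ The above inequality is an equality if each row of $\mathbf{S}$ has exactly $s$ non-zero entries.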
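As a worked example of how the query budget relates to the accuracy target, the sketch below assumes the guarantee scales the best possible squared Frobenius error by $1 + s/(m-s-1)$, the factor that appears (under a square root) in the reference curves of the Figure 2 and Figure 4 captions. The function names `error_factor` and `queries_for` are hypothetical.

```python
import math


def error_factor(s, m):
    """Assumed factor 1 + s / (m - s - 1) on the best squared Frobenius error."""
    if m < s + 2:
        raise ValueError("the theorem requires m >= s + 2")
    return 1.0 + s / (m - s - 1)


def queries_for(s, eps):
    """Smallest m with error_factor(s, m) <= 1 + eps, for s >= 1 and eps > 0.

    s / (m - s - 1) <= eps  is equivalent to  m >= s + 1 + s / eps.
    """
    return s + 1 + math.ceil(s / eps)


# Diagonal approximation (s = 1) with eps = 0.5 needs m = 4 queries,
# and a pattern with s = 3 nonzeros per row with eps = 0.25 needs m = 16.
print(queries_for(1, 0.5), error_factor(1, 4))
print(queries_for(3, 0.25), error_factor(3, 16))
```

Under this assumption the required number of queries is $s + 1 + \lceil s/\varepsilon \rceil$, which is $O(s/\varepsilon)$ for $\varepsilon \leq 1$, in line with the rate stated in the abstract.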

Figures (4)

  • Figure 1: Left: Visualization of a matrix described in \ref{sec:coloring_better} for which \ref{alg:main} is not the best method for recovering the diagonal (intensity indicates magnitude of entries of $\mathbf{A}$). In particular, the diagonal of the matrix can be recovered using exactly 2 queries, while \ref{alg:main} will require many queries to overcome the large noise in the off-diagonal blocks. Middle: Visualization of a matrix for which using the same colorings as for the matrix in the left panel will not help. Right: Visualization of the hard sparsity pattern described in \ref{sec:hard-coloring} with $k=10$. Here black pixels correspond to one and white pixels to zero. Note that while each row and column of the matrix has only $O(k)$ nonzeros, each pair of the $k^2$ columns has overlapping support.
  • Figure 2: Approximation of model problem matrix $\mathbf{A} = \operatorname{tridiag}(-1,4,-1)^{-1}$ by a matrix of total bandwidth $s$ for varying values of $s$. The solid circles indicate the root mean squared error of \ref{alg:main} over 20 independent runs of the algorithm, and the shaded region indicates the 10%-90% range. The dotted lines are the $\sqrt{s/(m-s-1)}\|\mathbf{A} -\mathbf{S}\circ\widetilde{\mathbf{A}}\|_\mathsf{F}$ (left) and $\sqrt{1+s/(m-s-1)}\|\mathbf{A} -\mathbf{S}\circ\widetilde{\mathbf{A}}\|_\mathsf{F}$ (right).
  • Figure 3: Left: Log-scale of the nonzero entries of $\mathbf{M}$, which range in magnitude from 1 to 7919. Middle: Log-scale of the nonzero entries of $\mathbf{A}$. Right: Sample sparsity pattern $\mathbf{S}$ corresponding to $b=5$.
  • Figure 4: Approximation of "Trefethen primes" inverse matrix by a multi-banded matrix for varying values of $s$. The solid circles indicate the root mean squared error of \ref{alg:main} over 100 independent runs of the algorithm, and the shaded region indicates the 10%-90% range. The dotted lines are the $\sqrt{s/(m-s-1)}\|\mathbf{A} -\mathbf{S}\circ\widetilde{\mathbf{A}}\|_\mathsf{F}$ (left) and $\sqrt{1+s/(m-s-1)}\|\mathbf{A} -\mathbf{S}\circ\widetilde{\mathbf{A}}\|_\mathsf{F}$ (right).

Theorems & Definitions (36)

  • Theorem 1
  • Corollary 1
  • Theorem 2
  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Theorem 2
  • Corollary 1
  • proof: Proof of \ref{thm:ub_main}
  • ...and 26 more