The Change-of-Measure Method, Block Lewis Weights, and Approximating Matrix Block Norms

Naren Sarayu Manoj; Max Ovsiankin

Abstract

Given a matrix $\mathbf{A} \in \mathbb{R}^{k \times n}$, a partitioning of $[k]$ into groups $S_1,\dots,S_m$, an outer norm $p$, and a collection of inner norms such that either $p \ge 1$ and $p_1,\dots,p_m \ge 2$ or $p_1=\dots=p_m=p \ge 1/\log n$, we prove that there is a sparse weight vector $\bm{\beta} \in \mathbb{R}^{m}$ such that $\sum_{i=1}^m \bm{\beta}_i \cdot \|\mathbf{A}_{S_i}\mathbf{x}\|_{p_i}^p \approx_{1\pm\varepsilon} \sum_{i=1}^m \|\mathbf{A}_{S_i}\mathbf{x}\|_{p_i}^p$, where the number of nonzero entries of $\bm{\beta}$ is at most $O_{p,p_i}(\varepsilon^{-2}n^{\max(1,p/2)}(\log n)^2(\log(n/\varepsilon)))$. When $p_1,\dots,p_m \ge 2$, this weight vector arises from an importance sampling procedure based on the \textit{block Lewis weights}, a recently proposed generalization of Lewis weights. Additionally, we prove that there exist efficient algorithms to find the sparse weight vector $\bm{\beta}$ in several important regimes of $p$ and $p_1,\dots,p_m$. Our results imply a $\widetilde{O}(\varepsilon^{-1}\sqrt{n})$-linear system solve iteration complexity for the problem of minimizing sums of Euclidean norms, improving over the previously known $\widetilde{O}(\sqrt{m}\log({1/\varepsilon}))$ iteration complexity when $m \gg n$. Our main technical contribution is a substantial generalization of the \textit{change-of-measure} method that Bourgain, Lindenstrauss, and Milman used to obtain the analogous result when every group has size $1$. Our generalization allows one to analyze changes of measure beyond those implied by D. Lewis's original construction, including the measure implied by the block Lewis weights and natural approximations of this measure.
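The importance sampling step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It only shows the generic reweighting $\bm{\beta}_i = (\text{number of times } i \text{ is drawn}) / (N \bm{\rho}_i)$ for a given probability vector $\bm{\rho}$. The function names, the sample count `num_samples`, and the placeholder probabilities in the demo (a mix of block Frobenius mass and the uniform distribution) are assumptions made for illustration. They are not the block Lewis weights, so the paper's $(1\pm\varepsilon)$ guarantee should not be expected from this sketch.

```python
import numpy as np


def block_norm_objective(A, groups, p_inner, p, x, beta=None):
    """Return sum_i beta_i * ||A_{S_i} x||_{p_i}^p (beta defaults to all ones)."""
    vals = np.array([
        np.linalg.norm(A[S] @ x, ord=p_i) ** p
        for S, p_i in zip(groups, p_inner)
    ])
    if beta is None:
        beta = np.ones(len(groups))
    return float(np.dot(beta, vals))


def sample_sparse_weights(rho, num_samples, rng):
    """Draw num_samples group indices i.i.d. from rho and reweight.

    beta_i = count_i / (num_samples * rho_i), so E[beta_i] = 1 and beta has
    at most num_samples nonzero entries.
    """
    rho = np.asarray(rho, dtype=float)
    counts = rng.multinomial(num_samples, rho)
    beta = np.zeros_like(rho)
    hit = counts > 0
    beta[hit] = counts[hit] / (num_samples * rho[hit])
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k, n, m = 600, 5, 200
    A = rng.standard_normal((k, n))
    groups = np.array_split(np.arange(k), m)  # partition of the k rows into m blocks
    p_inner, p = [2.0] * m, 1.0               # sum of Euclidean norms

    # ASSUMPTION: placeholder probabilities, NOT the block Lewis weights.
    block_mass = np.array([np.linalg.norm(A[S]) ** 2 for S in groups])
    rho = 0.5 * block_mass / block_mass.sum() + 0.5 / m

    beta = sample_sparse_weights(rho, num_samples=100, rng=rng)
    x = rng.standard_normal(n)
    full = block_norm_objective(A, groups, p_inner, p, x)
    sparse = block_norm_objective(A, groups, p_inner, p, x, beta)
    print("nonzeros in beta:", np.count_nonzero(beta))
    print("sparse / full objective:", sparse / full)
```

Because $\mathbb{E}[\bm{\beta}_i] = 1$ for every $i$, the reweighted sum is an unbiased estimate of the full sum for each fixed $\mathbf{x}$. The choice of $\bm{\rho}$ governs how many samples are needed for the approximation to hold for all $\mathbf{x}$ at once, which is where the block Lewis weights enter in the paper.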

Paper Structure (43 sections, 45 theorems, 250 equations, 1 table, 3 algorithms)

Introduction
Our results
Computing sampling probabilities.
Applications to minimizing sums of Euclidean norms.
Outline.
Notation and definitions
General notation.
Linear algebra notation.
Technical overview
Concentration
Covering numbers
The change-of-measure principle and norm interpolation
Handling general $S_i$.
Change-of-measures in functional analysis.
Prior results, related works, and connections
...and 28 more sections

Key Result

Theorem 1

Let $\mathcal{G} = (\mathbf{A} \in \mathbb{R}^{k \times n}, S_1,\dots,S_m, p_1,\dots,p_m)$, where $S_1,\dots,S_m$ form a partition of $[k]$. Suppose at least one of the following holds: (i) $p \ge 1$ and $p_1,\dots,p_m \ge 2$, or (ii) $p_1=\dots=p_m=p \ge 1/\log n$. Let $P \coloneqq \max\left(1, \max_{i \in [m]} \min(p_i,\log\left\lvert S_i \right\rvert)\right)$. Then there exists a probability distribution $\mathcal{D} = \left(\bm{\rho}_1,\dots,\bm{\rho}_m\right)$ such that

Theorems & Definitions (100)

Theorem 1: Block Lewis weight sampling
Theorem 2: Computation of block Lewis weights
Theorem 3: Minimizing sums of Euclidean norms
Definition 1.1: Sampling body
Definition 1.1: Block Lewis overestimate
Definition 2.1
Proof: Proof of \ref{fact:whitening}
Definition 2.4: Covering numbers \cite{rothvoss}
Definition 2.5: Entropy numbers \cite{vh15}
Definition 2.7: $\left\lVert \cdot \right\rVert_{\psi_2}$ and subgaussian random variable \cite{vershynin_2018}
...and 90 more
