Ridge Leverage Score Sampling for $\ell_p$ Subspace Approximation

David P. Woodruff; Taisuke Yasuda

TL;DR

This work constructs strong coresets for $\ell_p$ subspace approximation whose sizes are nearly optimal in $k$ for all $p \neq 2$, using a ridge leverage score sampling framework. It combines root ridge leverage score sampling, additive-multiplicative $\ell_p$ subspace embeddings, and a flattening step with a reduction to embedding low-rank matrices. The resulting guarantees are dimension-free, with coreset sizes $\tilde O(kε^{-4/p})$ for $1 \leq p < 2$ and $\tilde O(k^{p/2}ε^{-p})$ for $p > 2$, where $\tilde O$ hides polylogarithmic factors. The paper also gives nearly optimal online and streaming coresets and connects these coresets to entrywise $\ell_p$ low-rank approximation. The analysis avoids the representative-subspace approach used in prior work.
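As a quick sanity check on the two size bounds quoted above, plugging in a few values of $p$ (chosen here only for illustration) gives:

$$
\begin{aligned}
p = 1:&\quad k\,ε^{-4/p} = k\,ε^{-4},\\
p = 4:&\quad k^{p/2}\,ε^{-p} = k^{2}\,ε^{-4},\\
p \to 2:&\quad k\,ε^{-4/p} \to k\,ε^{-2} \quad\text{and}\quad k^{p/2}\,ε^{-p} \to k\,ε^{-2}.
\end{aligned}
$$

The two expressions meet at $k\,ε^{-2}$ as $p$ approaches 2 from either side, ignoring polylogarithmic factors.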

Abstract

The $\ell_p$ subspace approximation problem is an NP-hard low rank approximation problem that generalizes the median hyperplane ($p = 1$), principal component analysis ($p = 2$), and center hyperplane problems ($p = \infty$). A popular approach to cope with the NP-hardness is to compute a strong coreset, which is a weighted subset of input points that simultaneously approximates the cost of every $k$-dimensional subspace, typically to $(1+ε)$ relative error for a small constant $ε$. We obtain an algorithm for constructing a strong coreset for $\ell_p$ subspace approximation of size $\tilde O(kε^{-4/p})$ for $p<2$ and $\tilde O(k^{p/2}ε^{-p})$ for $p>2$. This offers the following improvements over prior work:

- We construct the first strong coresets with nearly optimal dependence on $k$ for all $p\neq 2$. In prior work, [SW18] constructed coresets of modified points with a similar dependence on $k$, while [HV20] constructed true coresets with polynomially worse dependence on $k$.
- We recover or improve the best known $ε$ dependence for all $p$. In particular, for $p > 2$, the [SW18] coreset of modified points had a dependence of $ε^{-p^2/2}$ and the [HV20] coreset had a dependence of $ε^{-3p}$.

Our algorithm is based on sampling by root ridge leverage scores, which admits fast algorithms, especially for sparse or structured matrices. Our analysis avoids the use of the representative subspace theorem [SW18], which is a critical component of all prior dimension-independent coresets for $\ell_p$ subspace approximation. Our techniques also lead to the first nearly optimal online strong coresets for $\ell_p$ subspace approximation with similar bounds as the offline setting, resolving a problem of [WY23]. All prior approaches lose $\mathrm{poly}(k)$ factors in this setting, even when allowed to modify the original points.
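To make the objects in the abstract concrete, here is a minimal NumPy sketch, not the paper's algorithm. It evaluates the $\ell_p$ subspace approximation cost $\sum_i \mathrm{dist}(a_i, F)^p$ and computes ridge leverage scores in their standard form $\tau_i = a_i^\top(\mathbf{A}^\top\mathbf{A} + \lambda \mathbf{I})^{-1}a_i$ with $\lambda = \|\mathbf{A} - \mathbf{A}_k\|_F^2/k$. The sampling rule $q_i = \min(1, \alpha\,\tau_i^{p/2})$, the `oversample` parameter $\alpha$, and all function names are assumptions made for illustration; the paper's actual sampling probabilities, sample sizes, and guarantees are not reproduced here.

```python
import numpy as np


def lp_subspace_cost(A, V, p):
    """Sum over rows of (Euclidean distance to span(V))^p.

    V is a d x k matrix with orthonormal columns.
    """
    residual = A - (A @ V) @ V.T
    return np.sum(np.linalg.norm(residual, axis=1) ** p)


def ridge_leverage_scores(A, k):
    """Ridge leverage scores with ridge lam = ||A - A_k||_F^2 / k.

    Assumes rank(A) > k so that lam > 0 and the linear system is well posed.
    """
    singular_values = np.linalg.svd(A, compute_uv=False)
    lam = np.sum(singular_values[k:] ** 2) / k
    gram = A.T @ A + lam * np.eye(A.shape[1])
    solved = np.linalg.solve(gram, A.T)  # d x n, column i is gram^{-1} a_i
    return np.einsum("ij,ji->i", A, solved)  # a_i^T gram^{-1} a_i per row


def sample_coreset(A, k, p, oversample, rng):
    """Keep row i independently with probability q_i, weight it by 1/q_i.

    q_i = min(1, oversample * tau_i^(p/2)) is an illustrative choice only.
    """
    tau = ridge_leverage_scores(A, k)
    q = np.minimum(1.0, oversample * tau ** (p / 2))
    keep = rng.random(A.shape[0]) < q
    return A[keep], 1.0 / q[keep]


def weighted_cost(rows, weights, V, p):
    """Weighted l_p subspace cost of the sampled rows."""
    residual = rows - (rows @ V) @ V.T
    return np.sum(weights * np.linalg.norm(residual, axis=1) ** p)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((2000, 20))
    k, p = 3, 1.0
    V, _ = np.linalg.qr(rng.standard_normal((20, k)))  # a random k-dim subspace
    rows, weights = sample_coreset(A, k, p, oversample=5.0, rng=rng)
    print("rows kept:", rows.shape[0], "of", A.shape[0])
    print("full cost:", lp_subspace_cost(A, V, p))
    print("weighted sample cost:", weighted_cost(rows, weights, V, p))
```

Because each kept row is reweighted by $1/q_i$, the weighted cost is an unbiased estimate of the full cost for any fixed subspace. The strong coreset guarantee is a much stronger statement, since it must hold for every $k$-dimensional subspace simultaneously.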

Paper Structure (42 sections, 37 theorems, 174 equations, 2 tables)

Introduction
The $\ell_p$ subspace approximation problem
Sparsification and coresets
Strong coresets for $\ell_p$ subspace approximation
Prior work
Our contributions
Nearly optimal online coresets for $\ell_p$ subspace approximation
Technical overview
Pitfalls in prior work
Ridge leverage scores
Reduction to embedding low rank matrices
Idea 1: additive-multiplicative $\ell_p$ subspace embeddings via root ridge leverage scores
Problems when bounding the additive error
Idea 2: splitting rows for sharper additive error bounds for $p<2$
Idea 3: Dvoretzky's theorem for sharper additive error bounds for $p>2$
...and 27 more sections

Key Result

Theorem 1.3

Let $1\leq p<2$. Let $\mathbf{A}\in\mathbb R^{n\times d}$. Then, there is an algorithm running in $\tilde{O}\left(\mathsf{nnz}(\mathbf{A}) + d^\omega\right)$ time which, with probability at least $1-\delta$, constructs a strong coreset $\mathbf{S}$ of size satisfying Definition 1.1, that is,

Theorems & Definitions (69)

Definition 1.1: Strong coresets for $\ell_p$ subspace approximation
Theorem 1.3
Theorem 1.4
Definition 1.6: $\ell_p$ sampling matrix
Definition 1.7: Ridge leverage scores [AM15, CMM17]
Remark 1.8
Definition 1.9
Theorem 2.1: Dvoretzky's theorem for $\ell_p$ norms [FLM77, PVZ17]
Lemma 2.2
proof
...and 59 more
