Accurate Analysis of Sparse Random Projections

Maciej Skórski

TL;DR

This work analyzes sparse Johnson-Lindenstrauss transforms with a focus on sharp, sub-Poisson tail bounds for norm preservation under sparse projections. By decomposing the Gram-matrix error, bounding moments of 1-D dense projections, and applying Poisson majorization, the author derives an explicit embedding size bound $m \ge \frac{4\log(2/\delta)}{\epsilon^2} \cdot h\left(\frac{25\epsilon}{p}\right)^{-1}$ (with $p \le 1/30$ and $\epsilon \le p\log(1/(2p))$) that matches known optimal dimensions in several regimes. The Bennett function $h(u)=\frac{(1+u)\log(1+u)-u}{u^2/2}$ governs the tail behavior, leading to a transparent Poisson-dominated explanation of the sparsity-distortion tradeoffs. The results yield practical sparse JL constructions with explicit constants, improving both theoretical understanding and applicability in high-dimensional data processing where fast, sparse projections are desirable.
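To make the size bound concrete, the following minimal sketch evaluates the Bennett function and the resulting lower bound on $m$ exactly as the formula above states it. The function names `bennett_h` and `min_embedding_dim` are hypothetical helpers introduced here for illustration; they are not taken from the paper.

```python
import math


def bennett_h(u: float) -> float:
    """Bennett function h(u) = ((1+u)log(1+u) - u) / (u^2 / 2), for u > 0."""
    if u <= 0:
        raise ValueError("u must be positive")
    # log1p keeps the numerator accurate when u is small.
    return ((1.0 + u) * math.log1p(u) - u) / (u * u / 2.0)


def min_embedding_dim(eps: float, delta: float, p: float) -> int:
    """Smallest integer m with m >= 4 log(2/delta) / eps^2 * h(25 eps / p)^(-1).

    Enforces the stated side conditions p <= 1/30 and eps <= p log(1/(2p)).
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if not 0.0 < p <= 1.0 / 30.0:
        raise ValueError("the bound is stated for 0 < p <= 1/30")
    if not 0.0 < eps <= p * math.log(1.0 / (2.0 * p)):
        raise ValueError("the bound is stated for eps <= p * log(1/(2p))")
    bound = 4.0 * math.log(2.0 / delta) / eps**2 / bennett_h(25.0 * eps / p)
    return math.ceil(bound)


if __name__ == "__main__":
    # Illustrative parameters only; they satisfy the side conditions above.
    print(min_embedding_dim(eps=0.05, delta=0.01, p=1.0 / 30.0))
```

Since $h(u) \le 1$ and $h(u) \to 1$ as $u \to 0$, the factor $h(25\epsilon/p)^{-1}$ is the price paid for sparsity relative to the $4\log(2/\delta)/\epsilon^2$ term.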

Abstract

There has recently been a lot of research on sparse variants of random projections, faster adaptations of the state-of-the-art dimensionality reduction technique originally due to Johnson and Lindenstrauss. Although the construction is very simple, its analyses are notoriously complicated. Meeting the demand for both simplicity and accuracy, this work establishes sharp sub-Poissonian tail bounds for the distribution of sparse random projections. Compared to other works, this analysis provides superior numerical guarantees (exactly matching impossibility results) while being arguably less complicated (the technique resembles Bennett's inequality and is of independent interest).

Paper Structure (10 sections, 2 theorems, 26 equations, 1 table, 1 algorithm)

Introduction
Contribution
Comparison to Related Works
Preliminaries
Proof of Theorem 1
Step 1: Decomposition over Embedding Dimension
Step 2: Moments of 1-Dim Dense Random Projections
Step 3: Poisson Majorization
Aggregating Bounds
Chernoff Bounds for Sub-Poisson Tails

Key Result

Theorem 1

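For any confidence $\delta >0$, distortion $\epsilon>0$ and integer sparsity $s>0$, there is a random $m\times n$ matrix $A$ with a $p=\frac{s}{m}$ fraction of non-zero elements in each row, such that the nearly isometric property (norm preservation up to distortion $\epsilon$, failing with probability at most $\delta$) holds for any vector $x$, provided that the embedding dimension satisfies $m \geqslant \frac{4\log(2/\delta)}{\epsilon^2} \cdot h\left(\frac{25\epsilon}{p}\right)^{-1}$ and that $p\leqslant \frac{1}{30}$, $\epsilon\leqslant p\log(1/(2p))$, where $h$ is the "Bennett function" $h(u)=\frac{(1+u)\log(1+u)-u}{u^2/2}$.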
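As an illustration of what such a matrix can look like, here is a minimal sketch of a sparse random projection. It assumes a common sparse Johnson-Lindenstrauss layout in which every column has exactly $s$ non-zero entries equal to $\pm 1/\sqrt{s}$ placed in uniformly chosen rows; the summary above does not spell out the sampling scheme, so this layout and the helper names (`sparse_projection`, `apply_projection`, `squared_norm`) are assumptions, not the paper's construction.

```python
import math
import random


def sparse_projection(m: int, n: int, s: int, rng: random.Random):
    """Sample a sparse m x n sign matrix, stored column by column.

    columns[j] is a list of s (row, value) pairs with value = +-1/sqrt(s).
    Assumed layout: exactly s non-zeros per column (illustrative choice).
    """
    if not 0 < s <= m:
        raise ValueError("need 0 < s <= m")
    scale = 1.0 / math.sqrt(s)
    columns = []
    for _ in range(n):
        rows = rng.sample(range(m), s)  # s distinct rows for this column
        columns.append([(r, rng.choice((-scale, scale))) for r in rows])
    return columns


def apply_projection(columns, m: int, x):
    """Compute y = A x, touching only the non-zero entries of A."""
    y = [0.0] * m
    for xj, col in zip(x, columns):
        if xj == 0.0:
            continue
        for r, v in col:
            y[r] += v * xj
    return y


def squared_norm(v) -> float:
    return sum(t * t for t in v)


if __name__ == "__main__":
    rng = random.Random(0)
    m, n, s = 300, 1000, 10  # p = s / m = 1/30
    A = sparse_projection(m, n, s, rng)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ratio = squared_norm(apply_projection(A, m, x)) / squared_norm(x)
    print(f"||Ax||^2 / ||x||^2 = {ratio:.4f}")
```

Under this assumed layout each column has unit Euclidean norm, so standard basis vectors are mapped without any distortion, and the distortion for general $x$ comes only from collisions between columns.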

Theorems & Definitions (2)

Theorem 1: Poisson Tails of Sparse Johnson-Lindenstrauss Lemma
Lemma 2 (cited from pollard2015mini)
