Optimal Embedding Dimension for Sparse Subspace Embeddings
Shabarish Chenakkod, Michał Dereziński, Xiaoyu Dong, Mark Rudelson
TL;DR
This work addresses the optimal embedding dimension question for sparse oblivious subspace embeddings (OSEs): for any fixed θ>0, it proves that m=(1+θ)d is achievable with column sparsity s=O(log^4(d)) and distortion ε=O(1). It introduces independent-diagonals constructions and uses universality to connect sparse embeddings to Gaussian sketches, enabling fast input-sparsity-time sketching and the first fast single-pass least squares (LS) algorithms with optimal embedding dimension, suitable for streaming and turnstile settings. The results also extend to Leverage Score Sparsification (LESS), a non-oblivious embedding that uses leverage-score information, which yields fast subspace embeddings with low distortion ε=o(1) and optimal dimension m=O(d/ε^2). Together, these results give fast sketches with optimal embedding dimension for linear regression and related problems in randomized linear algebra.
Abstract
A random $m\times n$ matrix $S$ is an oblivious subspace embedding (OSE) with parameters $ε>0$, $δ\in(0,1/3)$ and $d\leq m\leq n$, if for any $d$-dimensional subspace $W\subseteq \mathbb{R}^n$, $P\big(\,\forall_{x\in W}\ (1+ε)^{-1}\|x\|\leq\|Sx\|\leq (1+ε)\|x\|\,\big)\geq 1-δ.$ It is known that the embedding dimension of an OSE must satisfy $m\geq d$, and for any $θ> 0$, a Gaussian embedding matrix with $m\geq (1+θ) d$ is an OSE with $ε= O_θ(1)$. However, such an optimal embedding dimension is not known for other embeddings. Of particular interest are sparse OSEs, having $s\ll m$ non-zeros per column, with applications to problems such as least squares regression and low-rank approximation. We show that, given any $θ> 0$, an $m\times n$ random matrix $S$ with $m\geq (1+θ)d$ consisting of randomly sparsified $\pm1/\sqrt s$ entries and having $s= O(\log^4(d))$ non-zeros per column, is an oblivious subspace embedding with $ε= O_θ(1)$. Our result addresses the main open question posed by Nelson and Nguyen (FOCS 2013), who conjectured that sparse OSEs can achieve $m=O(d)$ embedding dimension, and it improves on $m=O(d\log(d))$ shown by Cohen (SODA 2016). We use this to construct the first oblivious subspace embedding with $O(d)$ embedding dimension that can be applied faster than current matrix multiplication time, and to obtain an optimal single-pass algorithm for least squares regression. We further extend our results to Leverage Score Sparsification (LESS), which is a recently introduced non-oblivious embedding technique. We use LESS to construct the first subspace embedding with low distortion $ε=o(1)$ and optimal embedding dimension $m=O(d/ε^2)$ that can be applied in current matrix multiplication time.
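To make the OSE definition concrete, the following is a minimal numerical sketch, not the paper's construction. It assumes a simple variant in which every column of $S$ has exactly $s$ non-zeros placed in uniformly random rows with random signs $\pm1/\sqrt s$; the function names `sparse_embedding` and `distortion`, and the parameter values $n=2000$, $d=50$, $θ=1$, $s=8$, are illustrative choices only. For one fixed subspace $W$ with orthonormal basis $U$, the smallest $ε$ satisfying the two-sided bound is determined by the extreme singular values of $SU$.

```python
import numpy as np


def sparse_embedding(m, n, s, rng):
    """Return an m x n matrix with exactly s non-zeros per column.

    Each non-zero is +1/sqrt(s) or -1/sqrt(s) with equal probability.
    This is an illustrative variant (an assumption), not necessarily
    the construction analyzed in the paper.
    """
    S = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=s, replace=False)  # s distinct rows
        signs = rng.choice([-1.0, 1.0], size=s)      # random signs
        S[rows, j] = signs / np.sqrt(s)
    return S


def distortion(S, U):
    """Smallest eps with (1+eps)^-1 ||x|| <= ||Sx|| <= (1+eps) ||x||
    for all x in the column span of U, where U has orthonormal columns."""
    sv = np.linalg.svd(S @ U, compute_uv=False)  # sorted, largest first
    if sv[-1] <= 0.0:
        return np.inf  # S collapses some direction of the subspace
    return max(sv[0], 1.0 / sv[-1]) - 1.0


rng = np.random.default_rng(0)
n, d, theta, s = 2000, 50, 1.0, 8  # illustrative sizes (assumptions)
m = int((1 + theta) * d)           # embedding dimension m = (1 + theta) d

# Orthonormal basis of a random d-dimensional subspace of R^n.
U, _ = np.linalg.qr(rng.standard_normal((n, d)))

S = sparse_embedding(m, n, s, rng)
eps = distortion(S, U)
print(f"m = {m}, s = {s}, empirical distortion eps = {eps:.3f}")
```

This checks a single random subspace only, so it illustrates the quantity $ε$ in the definition rather than verifying the oblivious guarantee, which must hold with probability $1-δ$ for every $d$-dimensional subspace.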
