Oja's Algorithm for Streaming Sparse PCA
Syamantak Kumar, Purnamrita Sarkar
TL;DR
This work analyzes Oja's streaming PCA algorithm in the high-dimensional sparse regime where the leading eigenvector $v_1$ is $s$-sparse. It introduces a simple one-pass thresholded Oja method combined with a support-recovery step and a data-splitting scheme, yielding minimax-optimal sparse PCA guarantees in $O(d)$ space and $O(nd)$ time under milder regularity than prior work. A novel entrywise analysis of the unnormalized Oja vector, together with a two-by-two linear-recursion framework, underpins the support recovery and sparse-PCA guarantees, while probabilistic boosting converts constant-probability results into high-probability outcomes. The results demonstrate that, in streaming settings with subgaussian data and a general covariance, one can achieve global, single-pass sparse PCA with strong statistical guarantees and practical computational efficiency.
Abstract
Oja's algorithm for Streaming Principal Component Analysis (PCA) for $n$ data points in a $d$-dimensional space achieves the same sin-squared error $O(r_{\mathsf{eff}}/n)$ as the offline algorithm, using $O(d)$ space, $O(nd)$ time, and a single pass through the data points. Here $r_{\mathsf{eff}}$ is the effective rank (the ratio of the trace to the principal eigenvalue of the population covariance matrix $\Sigma$). Under this computational budget, we consider the problem of sparse PCA, where the principal eigenvector of $\Sigma$ is $s$-sparse and $r_{\mathsf{eff}}$ can be large. In this setting, to our knowledge, \textit{there are no known single-pass algorithms} that achieve the minimax error bound in $O(d)$ space and $O(nd)$ time without either requiring strong initialization conditions or assuming further structure (e.g., spiked) of the covariance matrix. We show that a simple single-pass procedure that thresholds the output of Oja's algorithm (the Oja vector) can achieve the minimax error bound under some regularity conditions in $O(d)$ space and $O(nd)$ time. We present a nontrivial and novel analysis of the entries of the unnormalized Oja vector, which involves the projection of a product of independent random matrices on a random initial vector. This is completely different from previous analyses of Oja's algorithm and matrix products, which assume that $r_{\mathsf{eff}}$ is bounded.
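To make the single-pass procedure concrete, the following is a minimal illustrative sketch, not the authors' implementation. It runs the standard Oja update $w \leftarrow w + \eta\, x (x^\top w)$ over a stream and then thresholds the resulting vector. Several choices here are assumptions that the abstract does not specify: the thresholding rule (keeping the $s$ largest-magnitude entries, with $s$ assumed known), the step size $\eta = \log(n)/(n \cdot \text{gap})$, the per-step rescaling (which changes only the length of the vector, not its direction), and the spiked synthetic data used in the demo. The function names `oja_vector` and `truncate_top_s` are hypothetical, and the support-recovery and data-splitting steps mentioned in the TL;DR are omitted.

```python
import numpy as np


def oja_vector(stream, d, eta, rng):
    """One pass of Oja's update over `stream`, storing only a length-d vector."""
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)
    for x in stream:
        # Oja step: w <- (I + eta * x x^T) w, computed in O(d) time.
        w = w + eta * x * (x @ w)
        # Rescaling is for numerical stability only; the direction of w
        # matches that of the unnormalized Oja vector.
        w /= np.linalg.norm(w)
    return w


def truncate_top_s(w, s):
    """Keep the s largest-magnitude entries of w (assumed rule), renormalize."""
    keep = np.argpartition(np.abs(w), -s)[-s:]
    out = np.zeros_like(w)
    out[keep] = w[keep]
    return out / np.linalg.norm(out)


# Synthetic demo (assumption: spiked covariance Sigma = I + lam * v v^T,
# chosen only for convenience; the abstract targets general covariances).
rng = np.random.default_rng(0)
n, d, s, lam = 20000, 100, 5, 10.0
v = np.zeros(d)
v[:s] = 1.0 / np.sqrt(s)  # s-sparse leading eigenvector
X = rng.standard_normal((n, d)) + np.sqrt(lam) * rng.standard_normal((n, 1)) * v

eta = np.log(n) / (n * lam)  # eigengap of Sigma equals lam in this model
w_oja = oja_vector(X, d, eta, rng)
v_hat = truncate_top_s(w_oja, s)

sin2_oja = 1.0 - (w_oja @ v) ** 2
sin2_thresholded = 1.0 - (v_hat @ v) ** 2
print(f"sin^2 error, Oja vector:         {sin2_oja:.4f}")
print(f"sin^2 error, thresholded vector: {sin2_thresholded:.4f}")
```

The sketch stores a single length-$d$ vector and touches each data point once, which matches the $O(d)$ space and $O(nd)$ time budget described above. Here the rows of `X` are held in memory only for convenience; a true stream could be passed as any iterable of length-$d$ arrays.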
