Spectral Guarantees for Adversarial Streaming PCA

Eric Price, Zhiyang Xun

TL;DR

This work addresses streaming PCA under adversarial data order and asks how large the spectral ratio $R = \lambda_1/\lambda_2$ must be for the top eigenvector to be estimated in near-linear space. It shows that Oja's algorithm with a fixed learning rate, run on adversarial insertion-only streams, attains $o(1)$ error for $R = O(\log n\log d)$ while using near-linear space; it also introduces a practical variant that can declare failure when norms are unfavorable. The authors prove lower bounds: any mergeable-summaries approach requires $\Omega(d^2/R^2)$ space for 0.1-approximation, and a phase transition appears for $\varepsilon$-approximation, where the requirement becomes $\Omega(d^2/R^3)$ space for sufficiently large $d$, with constant $R$ forcing $\Omega(d^2)$ space. Overall, the paper provides the first spectral-tail analysis of Oja's method in adversarial streaming, clarifying when near-linear space is achievable and exhibiting a separation between the mergeable-summaries and insertion-only models, with implications for designing space-efficient PCA in streaming environments.

Abstract

In streaming PCA, we see a stream of vectors $x_1, \dotsc, x_n \in \mathbb{R}^d$ and want to estimate the top eigenvector of their covariance matrix. This is easier if the spectral ratio $R = \lambda_1 / \lambda_2$ is large. We ask: how large does $R$ need to be to solve streaming PCA in $\widetilde{O}(d)$ space? Existing algorithms require $R = \widetilde{\Omega}(d)$. We show: (1) For all mergeable summaries, $R = \widetilde{\Omega}(\sqrt{d})$ is necessary. (2) In the insertion-only model, a variant of Oja's algorithm gets $o(1)$ error for $R = O(\log n \log d)$. (3) No algorithm with $o(d^2)$ space gets $o(1)$ error for $R = O(1)$. Our analysis is the first application of Oja's algorithm to adversarial streams. It is also the first algorithm for adversarial streaming PCA that is designed for a spectral, rather than Frobenius, bound on the tail; and the bound it needs is exponentially better than is possible by adapting a Frobenius guarantee.
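For orientation, here is a minimal sketch of the classical Oja update with a fixed learning rate, $v \leftarrow (v + \eta\, x \langle x, v\rangle) / \lVert v + \eta\, x \langle x, v\rangle \rVert$, which keeps only one $d$-dimensional vector in memory. It is not necessarily the exact variant analyzed in the paper. The function names, the random initialization, and the synthetic (non-adversarial) demo stream are assumptions made for illustration.

```python
import math
import random


def oja_top_eigenvector(stream, d, eta, seed=0):
    """One pass of the classical Oja update, storing a single d-vector.

    Assumption: v starts as a random unit vector and is renormalized
    after every sample; the paper's variant may differ in such details.
    """
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for x in stream:
        dot = sum(a * b for a, b in zip(x, v))              # <x, v>
        v = [vi + eta * dot * xi for vi, xi in zip(v, x)]   # v + eta * x <x, v>
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]                           # back to unit norm
    return v


def demo_stream(n=2000, d=5, seed=1):
    """Hypothetical i.i.d. stream with a dominant first coordinate."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [0.05 * rng.gauss(0.0, 1.0) for _ in range(d)]
        x[0] = 0.9 * rng.choice((-1.0, 1.0))
        yield x


v_hat = oja_top_eigenvector(demo_stream(), d=5, eta=0.05)
```

On this easy stream the estimate `v_hat` should align closely with the first coordinate axis, up to sign. The paper's contribution concerns the much harder case where the order of the samples is chosen adversarially.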

Paper Structure (22 sections, 39 theorems, 139 equations, 3 figures, 1 table, 2 algorithms)

Key Result

Theorem 1.1

For any sufficiently large universal constant $C$, suppose $\eta$ is such that $\eta n\lambda_1 > C\log d$ and $\eta n\lambda_2 < \frac{1}{C\log n}$. If $\eta \left\lVert x_i\right\rVert^2 \leq 1$ for every $i$, then Oja's algorithm with learning rate $\eta$ returns $\widehat{v}$ satisfying …
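The two conditions on $\eta$ in the theorem's hypothesis define an interval: $\eta n\lambda_1 > C\log d$ gives a lower end and $\eta n\lambda_2 < \frac{1}{C\log n}$ gives an upper end. The interval is non-empty exactly when $\lambda_1/\lambda_2 > C^2 \log n \log d$, which matches the $\log n \log d$ scale of $R$ quoted in the abstract. The sketch below computes that interval. The function names, the use of natural logarithms, and the value $C = 2$ in the usage lines are assumptions for illustration; the theorem only says $C$ is a sufficiently large universal constant. The separate norm condition $\eta \lVert x_i \rVert^2 \leq 1$ is not checked here.

```python
import math


def required_ratio(n, d, C):
    """Smallest ratio lam1/lam2 (exclusive) for which a valid eta exists."""
    return C * C * math.log(n) * math.log(d)


def learning_rate_window(n, d, lam1, lam2, C):
    """Return (lo, hi) such that every eta with lo < eta < hi satisfies
    eta*n*lam1 > C*log(d) and eta*n*lam2 < 1/(C*log(n)); None if empty.

    Assumption: natural logarithms, and lam1, lam2 use the same
    normalization as in the theorem statement.
    """
    lo = C * math.log(d) / (n * lam1)
    hi = 1.0 / (C * math.log(n) * n * lam2)
    return (lo, hi) if lo < hi else None


# Hypothetical parameters: ratio 1000 is large enough, ratio 100 is not.
window_ok = learning_rate_window(10**6, 1000, 1.0, 1e-3, C=2.0)
window_bad = learning_rate_window(10**6, 1000, 1.0, 1e-2, C=2.0)
```

With these hypothetical numbers `window_ok` is a non-empty interval and `window_bad` is `None`, because the second ratio falls below `required_ratio(10**6, 1000, 2.0)`.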

Figures (3)

  • Figure 1: Suppose $\eta = 1$. Then even after convergence to $v^*$ exactly, a single final sample can skew the result by $\Theta(\sqrt{\sigma_2})$. For smaller $\eta$, the same can happen with $\frac{1}{\eta}$ final samples.
  • Figure 2: Lemma 2.1 states that, if the sum of squared distances across any subsequence of vectors $A_i$ is at most $B$, then the vector selecting the maximum value in each coordinate has squared norm $B \log^2 n$.
  • Figure 3: High-accuracy lower bound approach: Alice inserts a sequence of random bits (all but the last row). Bob knows the left side and wants to approximate the right side. To estimate the blue bits on the right, he adds $O(1)$ vectors using the corresponding red bits on the left and random bits on the right. With high probability, the principal component has constant correlation with the blue bits.
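The effect described in Figure 1 can be illustrated numerically: start from an estimate equal to $v^*$, apply one Oja step with a final sample that is not aligned with $v^*$, and measure how far the result moves. The following is a minimal sketch under stated assumptions. The two-dimensional setup, the 45-degree unit-norm sample, and the helper names are hypothetical, and the example does not attempt to reproduce the $\Theta(\sqrt{\sigma_2})$ scaling from the caption.

```python
import math


def oja_step(v, x, eta):
    """Single Oja update followed by renormalization."""
    dot = sum(a * b for a, b in zip(v, x))
    w = [vi + eta * dot * xi for vi, xi in zip(v, x)]
    norm = math.sqrt(sum(c * c for c in w))
    return [c / norm for c in w]


def perpendicular_error(v, v_star):
    """Norm of the component of unit vector v orthogonal to unit vector v_star."""
    dot = sum(a * b for a, b in zip(v, v_star))
    return math.sqrt(max(0.0, 1.0 - dot * dot))


v_star = [1.0, 0.0]
x_last = [math.sqrt(0.5), math.sqrt(0.5)]  # hypothetical final sample, unit norm

# One final sample with eta = 1 versus the same sample with a small eta.
err_large_eta = perpendicular_error(oja_step(v_star, x_last, 1.0), v_star)
err_small_eta = perpendicular_error(oja_step(v_star, x_last, 0.01), v_star)
```

With $\eta = 1$ the single sample moves the estimate noticeably away from $v^*$, while with $\eta = 0.01$ the same sample barely moves it, consistent with the caption's remark that smaller $\eta$ needs on the order of $\frac{1}{\eta}$ final samples for a comparable skew.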

Theorems & Definitions (70)

  • Theorem 1.1: Performance of Oja's method in adversarial streams
  • Theorem 1.2: Full algorithm
  • Theorem 1.3: Mergeable Lower Bound
  • Theorem 1.4: Accuracy Lower Bound
  • Theorem 1.5
  • Lemma 2.0: Growth implies correctness
  • Lemma 2.0
  • Lemma 2.1: Simplified version of Lemma \ref{lem:matsample}
  • Remark 2.2
  • Lemma 2.2
  • ...and 60 more