Learning-Augmented Frequent Directions

Anders Aamand, Justin Y. Chen, Siddharth Gollapudi, Sandeep Silwal, Hao Wu

TL;DR

This work develops a learning-augmented framework for streaming problems that extends beyond one-dimensional frequency estimation to high-dimensional matrix streaming. It introduces a deterministic, learning-enhanced Misra-Gries variant that matches state-of-the-art bounds for frequency estimation under Zipfian data, with an error bound of the form $\Theta\left( \frac{1}{m} \cdot \frac{n}{(\ln d)^2} \right)$. It also generalizes Frequent Directions to incorporate learned priors on the top directions, achieving improved space/accuracy tradeoffs; the perfect-prediction case yields $\Theta\left( \frac{1}{(\ln d)^2} \cdot \frac{\|A\|_F^2}{m} \right)$ for the learned variant, while robust variants (RLFD) ensure worst-case guarantees. The paper demonstrates substantial empirical gains on real datasets, with 1–2 orders of magnitude improvements over non-learned baselines and competitive performance against memory-heavy full-matrix SVD. Overall, the work shows that integrating accurate learned predictions into streaming sketches can yield near-optimal, deterministic, and robust algorithms for both frequency estimation and low-rank matrix sketching.
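To make the frequency-estimation side concrete, here is a minimal sketch of how learned predictions could be combined with Misra-Gries. It assumes the generic recipe of giving predicted heavy items their own exact counters and running plain Misra-Gries on the remaining items. The function names, the `m // 2` space split, and the toy stream are illustrative assumptions, not the paper's algorithm or parameters.

```python
from collections import Counter


def mg_update(counters, item, k):
    """One Misra-Gries step using at most k counters."""
    if item in counters:
        counters[item] += 1
    elif len(counters) < k:
        counters[item] = 1
    else:
        # No free counter: decrement every counter and drop those at zero.
        for key in list(counters):
            counters[key] -= 1
            if counters[key] == 0:
                del counters[key]


def learned_misra_gries(stream, m, predicted_heavy):
    """Hypothetical split (an assumption): up to m // 2 exact counters for
    predicted heavy items, the remaining counters run plain Misra-Gries."""
    exact = {x: 0 for x in list(predicted_heavy)[: m // 2]}
    k = m - len(exact)
    sketch = {}
    for item in stream:
        if item in exact:
            exact[item] += 1  # predicted heavy items are counted exactly
        else:
            mg_update(sketch, item, k)
    return {**sketch, **exact}


# Toy stream: "a" is heavy and correctly predicted.
stream = ["a", "b", "c"] * 10 + list(range(20)) + ["a"] * 40
est = learned_misra_gries(stream, m=4, predicted_heavy=["a"])
truth = Counter(stream)
```

In this toy run the predicted item `"a"` is counted exactly, while every other item is underestimated by at most the standard Misra-Gries bound, namely the length of the non-predicted part of the stream divided by `k + 1`. Removing heavy items from the Misra-Gries portion shrinks that bound, which is the intuition behind the improved error under accurate predictions.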

Abstract

An influential paper of Hsu et al. (ICLR'19) introduced the study of learning-augmented streaming algorithms in the context of frequency estimation. A fundamental problem in the streaming literature, the goal of frequency estimation is to approximate the number of occurrences of items appearing in a long stream of data using only a small amount of memory. Hsu et al. develop a natural framework to combine the worst-case guarantees of popular solutions such as CountMin and CountSketch with learned predictions of high frequency elements. They demonstrate that learning the underlying structure of data can be used to yield better streaming algorithms, both in theory and practice. We simplify and generalize past work on learning-augmented frequency estimation. Our first contribution is a learning-augmented variant of the Misra-Gries algorithm which improves upon the error of learned CountMin and learned CountSketch and achieves the state-of-the-art performance of randomized algorithms (Aamand et al., NeurIPS'23) with a simpler, deterministic algorithm. Our second contribution is to adapt learning-augmentation to a high-dimensional generalization of frequency estimation corresponding to finding important directions (top singular vectors) of a matrix given its rows one-by-one in a stream. We analyze a learning-augmented variant of the Frequent Directions algorithm, extending the theoretical and empirical understanding of learned predictions to matrix streaming.
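For orientation, the baseline that the second contribution builds on can be sketched as follows. This is the standard, non-learned Frequent Directions algorithm in its simplest form (one SVD per incoming row), not the paper's learned variant. The function names, the choice of `ell`, and the random test matrix are assumptions made for illustration.

```python
import numpy as np


def frequent_directions(rows, ell):
    """Standard Frequent Directions: keep an ell x d sketch B of the stream."""
    rows = np.asarray(rows, dtype=float)
    d = rows.shape[1]
    assert ell <= d, "this simple version assumes ell <= d"
    B = np.zeros((ell, d))
    for a in rows:
        B[-1] = a  # the last row is zero before every insertion
        _, s, Vt = np.linalg.svd(B, full_matrices=False)
        delta = s[-1] ** 2  # smallest squared singular value
        # Shrink all squared singular values by delta; the last row becomes zero.
        B = np.sqrt(np.maximum(s ** 2 - delta, 0.0))[:, None] * Vt
    return B


def estimate(B, v):
    """Estimate ||A v||^2 from the sketch as ||B v||^2."""
    return float(np.linalg.norm(B @ v) ** 2)


# Illustrative data (an assumption): 200 random rows in 20 dimensions.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 20))
ell = 8
B = frequent_directions(A, ell)
v = np.ones(20) / np.sqrt(20)  # a unit-norm query direction
err = np.linalg.norm(A @ v) ** 2 - estimate(B, v)
bound = np.linalg.norm(A, "fro") ** 2 / ell
```

For any unit vector `v`, this sketch never overestimates: the error `err` lies between 0 and `bound`, up to floating-point error. The analogy with Misra-Gries is direct: shrinking singular values plays the role of decrementing counters, and a row of the sketch plays the role of a counter.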

Paper Structure

This paper contains 45 sections, 9 theorems, 51 equations, 18 figures, 3 tables, 3 algorithms.

Key Result

Lemma 2.1

For algorithms that estimate $\tilde{f}{\left( {\vec{v}} \right)}$ by first constructing a matrix $\mathbf{B}$ and then applying the formula $\tilde{f}{\left( {\vec{v}} \right)} = \left\Vert {\mathbf{B} \vec{v}} \right\Vert_2^2$ such that $0 \leq \tilde{f}{\left( {\vec{v}} \right)} \leq f{\left( {\v

Figures (18)

  • Figure 1: Comparison of matrix approximations. The Frequent Directions and learning-augmented Frequent Directions algorithms are streaming algorithms, while the exact SVD stores the entire matrix to compute a low-rank approximation (so it cannot be implemented in a stream). For each dataset, the left plot shows median error (using the paper's expected-error definition for Frequent Directions) as the rank of the approximation varies, while the right plot shows error over the sequence of matrices with a fixed rank of $100$. The sudden drop in error in Eagle corresponds to several frames of a black screen in the video.
  • Figure 2: Comparison of learning-augmented frequency estimation algorithms. Top: CAIDA, Bottom: AOL. For both datasets, the left plot shows the median error of each method (across all 50 streams) with varying space budgets. The right plot shows the performance of each algorithm across streams with a fixed space of $750$ words. Randomized algorithms are averaged across 10 trials and one standard deviation is shaded.
  • Figure 3: Log-log plot of frequencies for the CAIDA and AOL datasets.
  • Figure 4: Log-log plot of singular values for the first Hyper and Logo matrices.
  • Figure 5: Log-log plot of singular values for the first Eagle and Friends matrices.
  • ...and 13 more figures

Theorems & Definitions (17)

  • Lemma 2.1
  • Proposition 2.2: Liberty22
  • Theorem 3.1: Expected Error of the Misra-Gries Algorithm
  • Theorem 3.2: Expected Error of the Learned Misra-Gries Algorithm
  • Theorem 3.3: Expected Error of the Frequent Directions Algorithm
  • Theorem 3.4: Expected Error of the Learned Frequent Directions Algorithm
  • proof : Proof of Lemma (equivalence of weighted error)
  • proof : Proof of Fact (property of Frequent Directions)
  • proof : Proof of Theorem (Expected Error of the Misra-Gries Algorithm)
  • proof : Proof of Theorem (Expected Error of the Learned Misra-Gries Algorithm)
  • ...and 7 more