Learning-Augmented Moment Estimation on Time-Decay Models

Soham Nagawanshi, Shalini Panthangi, Chen Wang, David P. Woodruff, Samson Zhou

TL;DR

This paper utilizes an oracle for the heavy-hitters of datasets to give learning-augmented algorithms for a number of fundamental problems, such as norm/moment estimation, frequency estimation, cascaded norms, and rectangular moment estimation, in the time-decay setting.

Abstract

Motivated by the prevalence and success of machine learning, a line of recent work has studied learning-augmented algorithms in the streaming model. These results have shown that for natural and practical oracles implemented with machine learning models, we can obtain streaming algorithms with improved space efficiency that are otherwise provably impossible. On the other hand, our understanding is much more limited when items are weighted unequally, for example, in the sliding-window model, where older data must be expunged from the dataset, e.g., by privacy regulation laws. In this paper, we utilize an oracle for the heavy-hitters of datasets to give learning-augmented algorithms for a number of fundamental problems, such as norm/moment estimation, frequency estimation, cascaded norms, and rectangular moment estimation, in the time-decay setting. We complement our theoretical results with a number of empirical evaluations that demonstrate the practical efficiency of our algorithms on real and synthetic datasets.
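To make the quantity in the abstract concrete, here is a minimal sketch of the frequency moment $F_p = \sum_i x_i^p$ restricted to a sliding window, where only the most recent `window` items count toward the frequency vector. This is an exact, linear-space baseline for illustration only, not the paper's learning-augmented algorithm; the function name `sliding_window_moment` and its parameters are hypothetical.

```python
from collections import Counter, deque


def sliding_window_moment(stream, window, p):
    """Return the exact F_p of the active window after each arrival.

    Hypothetical baseline: stores every active item, so it uses space
    linear in the window size (the cost a streaming algorithm avoids).
    """
    counts = Counter()   # frequency vector of the active window
    active = deque()     # items currently inside the window, oldest first
    moments = []
    for item in stream:
        active.append(item)
        counts[item] += 1
        if len(active) > window:
            # Expire the oldest item once the window is full.
            expired = active.popleft()
            counts[expired] -= 1
            if counts[expired] == 0:
                del counts[expired]
        moments.append(sum(c ** p for c in counts.values()))
    return moments


if __name__ == "__main__":
    print(sliding_window_moment(["a", "b", "a", "a", "c"], window=3, p=2))
```

Because every active item is stored, this baseline gives the ground truth that a small-space estimator would be compared against.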

Paper Structure (25 sections, 31 theorems, 20 equations, 6 figures, 1 table)

Key Result

Proposition 1

Let $f$ be an $(\alpha, \beta)$-smooth function, and let $\textnormal{ALG}$ be a streaming algorithm that outputs $f(\mathbf{x})$ by the end of the stream, where $\mathbf{x}$ is the frequency vector of the stream. Suppose $\textnormal{ALG}$ uses $g$ space and performs

Figures (6)

  • Figure 1: Example of the smooth histogram framework. Here $\textnormal{ALG}^{(t_2)}$ and $\textnormal{ALG}^{(t_3)}$ sandwich the active elements and are thus good approximations of the sliding window.
  • Figure 2: Experiments for $\ell_2$ norm estimation on CAIDA. Note on notation: the variable $n$ in the figures refers to the stream length (denoted $m$ elsewhere in the paper).
  • Figure 3: Experiments for $\ell_3$ estimation on CAIDA using multiple oracles
  • Figure 4: Experiments for $\ell_3$ estimation on AOL
  • Figure 5: Experiments for $\ell_3$ estimation on synthetic data
  • ...and 1 more figure

Theorems & Definitions (46)

  • Definition 1: Heavy-hitter oracles
  • Definition 2: Suffix-compatible heavy-hitter oracles
  • Definition 3: Common suffix-augmented frequency vectors
  • Definition 4: $(\alpha,\beta)$-smooth functions, Definition 1 of BravermanO07
  • Proposition 1: Exact algorithms, Theorem 1 of BravermanO07
  • Proposition 2: Approximate algorithms, Theorem 2 & 3 of BravermanO07
  • Lemma 3.1
  • proof
  • Theorem 1: Learning-augmented $F_p$ frequency moment algorithm
  • Theorem 2: Learning-augmented $F_p$ frequency algorithm with stochastic oracles
  • ...and 36 more