Sketching, Moment Estimation, and the Lévy-Khintchine Representation Theorem
Seth Pettie, Dingyu Wang
Abstract
In the $d$-dimensional turnstile streaming model, a frequency vector $\mathbf{x}=(\mathbf{x}(1),\ldots,\mathbf{x}(n))\in (\mathbb{R}^d)^n$ is updated entrywise over a stream. We consider the problem of \emph{$f$-moment estimation}, in which one wants to estimate $$f(\mathbf{x})=\sum_{v\in[n]}f(\mathbf{x}(v))$$ with a small-space sketch. In this work we present a simple and generic scheme for constructing sketches, based on the novel idea of hashing indices to \emph{Lévy processes}, from which one can estimate the $f$-moment $f(\mathbf{x})$, where $f$ is the \emph{characteristic exponent} of the Lévy process. The fundamental \emph{Lévy-Khintchine representation theorem} completely characterizes the space of all possible characteristic exponents, which in turn characterizes the set of $f$-moments that can be estimated by this generic scheme. The new scheme has strong explanatory power. It unifies the construction of many existing sketches ($F_0$, $L_0$, $L_2$, $L_\alpha$, $L_{p,q}$, etc.) and it implies the tractability of many nearly periodic functions that were previously unclassified. Furthermore, the scheme generalizes naturally to the multidimensional case ($d\geq 2$) by considering multidimensional Lévy processes, and it can be further generalized to estimate \emph{heterogeneous moments} by projecting different indices with different Lévy processes. We conjecture that the set of tractable functions can be characterized using the Lévy-Khintchine representation theorem via what we call the \emph{Fourier-Hahn-Lévy} method.
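As an illustration of how a moment can be read off a characteristic exponent, consider the one-dimensional Cauchy process, whose characteristic exponent is $f(z)=|z|$, that is, $\mathbb{E}\,e^{izX_t}=e^{-t|z|}$. If each index $v$ receives an independent copy $X^{(v)}_t$, the linear counter $C=\sum_{v}\mathbf{x}(v)X^{(v)}_t$ satisfies $\mathbb{E}\,e^{iC}=e^{-t\sum_v|\mathbf{x}(v)|}$, so the $L_1$ moment can be recovered from the empirical mean of $\cos(C)$. The Python sketch below is a minimal simulation under these assumptions and should not be read as the estimator of this work: the class name `CauchySketch`, the parameters `k`, `t`, and `seed`, and the explicit random matrix (which stands in for hash functions and is not small-space) are all hypothetical choices made for illustration.

```python
import numpy as np


class CauchySketch:
    """Toy linear sketch for the L_1 moment (d = 1), for illustration only.

    Assumption: index v is mapped to k independent Cauchy-process values
    X_t ~ t * standard Cauchy, so that E[exp(i z X_t)] = exp(-t |z|).
    A real sketch would derive these values from hash functions instead of
    storing an explicit k-by-n matrix.
    """

    def __init__(self, n, k=20000, t=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.t = t
        # Row j holds the j-th independent copy of (X_t^{(1)}, ..., X_t^{(n)}).
        self.X = t * rng.standard_cauchy(size=(k, n))
        # k linear counters C_j = sum_v x(v) * X_t^{(v, j)}; x starts at zero.
        self.counters = np.zeros(k)

    def update(self, v, delta):
        """Turnstile update x(v) <- x(v) + delta (delta may be negative)."""
        self.counters += delta * self.X[:, v]

    def estimate(self):
        """Estimate sum_v |x(v)| from the empirical characteristic function."""
        # E[cos(C_j)] = Re E[exp(i C_j)] = exp(-t * sum_v |x(v)|).
        m = np.mean(np.cos(self.counters))
        return -np.log(m) / self.t


if __name__ == "__main__":
    sk = CauchySketch(n=5)
    # Final vector is (3, -1, 0, 2, -4); the two updates to index 2 cancel.
    for v, delta in [(0, 3), (1, -1), (2, 5), (3, 2), (4, -4), (2, -5)]:
        sk.update(v, delta)
    print("true L_1 moment: 10, estimate:", sk.estimate())
```

In this toy setting the time parameter `t` is chosen so that $t\cdot f(\mathbf{x})$ is of order $1$; if it is much larger, $e^{-t f(\mathbf{x})}$ is too close to zero to be estimated reliably from a modest number of counters.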
