On the (Generative) Linear Sketching Problem

Xinyu Yuan; Yan Qiao; Zonghui Wang; Wenzhi Chen

Abstract

Sketch techniques have been extensively studied in recent years and are especially well-suited to data streaming scenarios, where the sketch summary is updated quickly and compactly. However, it is challenging to recover the current state from these summaries in a way that is accurate, fast, and real. In this paper, we seek a solution that reconciles this tension, aiming for near-perfect recovery with lightweight computational procedures. Focusing on linear sketching problems of the form $\boldsymbol{\Phi} f \rightarrow f$, our study proceeds in three stages. First, we dissect existing techniques and show the root cause of the sketching dilemma: an orthogonal information loss. Second, we examine how generative priors can be leveraged to bridge the information gap. Third, we propose FLORE, a novel generative sketching framework that embraces these analyses to achieve the best of all worlds. More importantly, FLORE can be trained without access to ground-truth data. Comprehensive evaluations demonstrate FLORE's ability to provide high-quality recovery and to support summarization with low computing overhead: it reduces error by up to 1000 times relative to previous methods and runs up to 100 times faster than learning-based solutions.

Paper Structure (56 sections, 10 theorems, 41 equations, 29 figures, 2 tables, 6 algorithms)

Introduction
Motivating Analyses
Problem Formulation
Limitations of Existing Techniques
The Anatomy of Failure
Deep Generative Linear Sketching
Integrating Generative Priors
FLORE: FLow-based Orthogonal REcovery
Evaluation
Per-Element Frequency Estimation
Heavy-Hitter Detection
Estimation of Distribution & Entropy
Processing Efficiency
Other Experiments
Conclusion
...and 41 more sections

Key Result

Proposition 2.1

Suppose that for $p>0$, the error of the best $s$-term approximation is denoted by $\sigma_s(f)_p \coloneqq \inf_{\|z\|_0 \leq s} \| f-z \|_p$. Then, for any vector $f \in \mathbb{C}^N$, a CS solution $f^{\star}$ with $b = \boldsymbol{\Phi} f + \boldsymbol{\epsilon}$ and $\|\boldsymbol{\epsilon}\|_2 \le \dots$, where the constant $C = \frac{2(1+\rho)}{1-\rho}$ and $\rho \propto \delta_{2s}$.
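The quantity $\sigma_s(f)_p$ in the proposition can be made concrete with a small computation. The infimum over $s$-sparse vectors $z$ is attained by keeping the $s$ largest-magnitude entries of $f$, so the error is the $\ell_p$ (quasi-)norm of the remaining entries. The following is a minimal sketch of that fact; the function name `best_s_term_error` and the sample vector are illustrative assumptions, not taken from the paper.

```python
def best_s_term_error(f, s, p=1.0):
    """Return sigma_s(f)_p: the l_p error of the best s-term approximation of f.

    The optimal s-sparse z keeps the s largest-magnitude entries of f,
    so the error is the l_p (quasi-)norm of the entries that are dropped.
    Works for real or complex entries, since abs() returns the modulus.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    if s < 0:
        raise ValueError("s must be non-negative")
    # Sort magnitudes in decreasing order and discard the s largest.
    tail = sorted((abs(x) for x in f), reverse=True)[s:]
    return sum(t ** p for t in tail) ** (1.0 / p)


# Hypothetical frequency vector: two heavy items plus a light tail.
f_example = [10, -7, 0.5, 0.25, 0]
err_l1 = best_s_term_error(f_example, s=2, p=1)   # tail is [0.5, 0.25, 0]
err_all = best_s_term_error(f_example, s=5, p=1)  # nothing is dropped
```

A vector that is nearly $s$-sparse has a small $\sigma_s(f)_p$, which is the regime in which compressive-sensing bounds of this kind are informative.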

Figures (29)

Figure 1: Comparison of our generative sketch solution and existing solutions. From (a) to (c), we present an overview of the full pipelines of three sketching frameworks. In (d), our proposal achieves the best trade-off among accuracy, fidelity and efficiency.
Figure 2: Matrix formulation of Count-Min Sketch. The linear sketching problem can be analyzed via simple matrix operations.
Figure 3: Comparison of compressive sensing and sketch techniques.
Figure 4: Illustration of our key insights. (a) The original streaming data cannot be recovered due to orthogonal information loss in the null space. (b) We leverage GMs to generate the null-space component which is in harmony with the range-space component.
Figure 5: Performance comparison across different GMs. FGM achieves the most favorable trade-off in sketching tasks (higher is better). Detailed results are provided in Appendix \ref{app:gm_select}.
...and 24 more figures
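The captions of Figure 2 and Figure 4(a) describe Count-Min Sketch as a matrix operation and attribute recovery failure to information lost in the null space. The toy example below illustrates both ideas under stated assumptions: the sizes, the two hash functions, and the vectors `f` and `v` are hypothetical choices made so that the null-space vector can be checked by hand. It is not the paper's construction or experimental setup.

```python
# Toy Count-Min Sketch written as a linear map b = Phi f (pure Python).
# Sizes and hash functions are illustrative assumptions only.

N = 4                                          # number of distinct items (length of f)
WIDTH = 2                                      # buckets per sketch row
HASHES = [lambda i: i % 2, lambda i: i // 2]   # one toy hash per sketch row


def build_phi(n, width, hashes):
    """Stack one 0/1 block per hash row: Phi[r*width + h_r(i)][i] = 1."""
    phi = [[0] * n for _ in range(len(hashes) * width)]
    for r, h in enumerate(hashes):
        for i in range(n):
            phi[r * width + h(i)][i] = 1
    return phi


def sketch(phi, f):
    """Linear sketch: the matrix-vector product Phi f."""
    return [sum(p * x for p, x in zip(row, f)) for row in phi]


def count_min_query(b, i, width, hashes):
    """Classic Count-Min estimate: the smallest counter item i hashes to."""
    return min(b[r * width + h(i)] for r, h in enumerate(hashes))


phi = build_phi(N, WIDTH, HASHES)
f = [5, 1, 1, 0]                        # one possible stream state
v = [1, -1, -1, 1]                      # lies in the null space of this Phi
g = [a + c for a, c in zip(f, v)]       # a different state, [6, 0, 0, 1]

b_f, b_g = sketch(phi, f), sketch(phi, g)
estimates = [count_min_query(b_f, i, WIDTH, HASHES) for i in range(N)]
```

Here `f` and `g` are distinct non-negative frequency vectors with identical sketches, so no decoder can tell them apart from the sketch alone. The standard Count-Min query never underestimates on non-negative streams, but it also cannot resolve this ambiguity.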

Theorems & Definitions (27)

Proposition 2.1
Theorem 2.2
Proposition 2.3
Theorem 3.1
Theorem 3.2
Theorem 3.3
Definition 5.1
Definition 5.2
Definition 5.3
Definition 5.4
...and 17 more
