On the Adversarial Robustness of Online Importance Sampling
Yotam Kenneth-Mordoch, Shay Sapir
TL;DR
The paper analyzes the adversarial robustness of online importance sampling on streaming data, introducing a self-weighted online sampling framework that preserves a (1±ε)-approximation of sums even against adaptive adversaries. It establishes a one-dimensional robustness result and extends it to multi-dimensional objectives via a coreset-based approach. The authors apply this to two fundamental tasks, hypergraph cut sparsification and ℓ_p subspace embedding, obtaining near-optimal space bounds and a generic black-box wrapper that combines online sampling with merge-and-reduce to yield robust, scalable streaming algorithms. These contributions close gaps to the oblivious online bounds, improve on previous adversarial bounds, and offer a unified technique applicable to a broad class of coreset-style problems in adaptive streams.
Abstract
This paper studies the adversarial-robustness of importance-sampling (aka sensitivity sampling), a useful algorithmic technique that samples elements with probabilities proportional to some measure of their importance. A streaming or online algorithm is called adversarially-robust if it succeeds with high probability on input streams that may change adaptively depending on previous algorithm outputs. Unfortunately, the dependence between stream elements breaks the analysis of most randomized algorithms, and in particular that of importance-sampling algorithms. Previously, Braverman et al. [NeurIPS 2021] suggested that streaming algorithms based on importance-sampling may be adversarially-robust; however, they proved it only for well-behaved inputs. We focus on the adversarial-robustness of online importance-sampling, a natural variant where sampling decisions are irrevocable and made as data arrives. Our main technical result shows that, given as input an adaptive stream of elements $x_1,\ldots,x_T\in \mathbb{R}_+$, online importance-sampling maintains a $(1\pm\varepsilon)$-approximation of their sum while matching (up to lower order terms) the storage guarantees of the oblivious (non-adaptive) case. We then apply this result to develop adversarially-robust online algorithms for two fundamental problems: hypergraph cut sparsification and $\ell_p$ subspace embedding.
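To make the one-dimensional setting concrete, the following is a minimal Python sketch of online importance sampling for estimating a sum of positive reals. It is an illustration under stated assumptions, not the paper's algorithm: the sampling rule $p_t = \min(1, c\,x_t/(\varepsilon^2 S_t))$, the use of the exact running sum $S_t$ as the importance measure, the oversampling constant `c`, and the function names are all hypothetical choices made here. The paper's actual probabilities and constants may differ.

```python
import random


def online_importance_sample(stream, eps, c=1.0, rng=None):
    """Irrevocably sample a stream of nonnegative reals as items arrive.

    Assumed rule (illustrative, not taken from the paper): keep x_t with
    probability p_t = min(1, c * x_t / (eps^2 * S_t)), where S_t is the
    exact running sum including x_t, and store it with weight x_t / p_t.
    """
    rng = rng or random.Random(0)
    running_sum = 0.0
    sample = []  # (value, weight) pairs; a kept item is never revisited
    for x in stream:
        if x < 0:
            raise ValueError("stream elements must be nonnegative")
        if x == 0:
            continue  # zeros contribute nothing to the sum
        running_sum += x
        p = min(1.0, c * x / (eps ** 2 * running_sum))
        if rng.random() < p:
            sample.append((x, x / p))  # reweight by the inverse probability
    return sample


def estimate_sum(sample):
    """Sum of stored weights; each kept item contributes x_t / p_t."""
    return sum(weight for _, weight in sample)


if __name__ == "__main__":
    stream = [1.0] * 1000
    sample = online_importance_sample(stream, eps=0.25)
    print(len(sample), estimate_sum(sample))
```

Because each item is kept with probability $p_t$ and reweighted by $1/p_t$, it contributes $x_t$ to the estimate in expectation. In the adaptive setting, the adversary may choose $x_{t+1}$ after observing the current sample, which is exactly the dependence that the abstract says breaks standard analyses.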
