Refining the Adaptivity Notion in the Huge Object Model

Tomer Adar, Eldar Fischer

TL;DR

This work investigates the adaptivity landscape in the Huge Object distribution-testing model, moving beyond the classic adaptive vs non-adaptive dichotomy by introducing a hierarchy of intermediate models such as locally-bounded, forward-only, and memory-bounded adaptivity. The authors develop formal definitions, reductions, and novel techniques (notably the split-adaptive model and query foresight) to establish exponential separations between consecutive models, showing that small changes in information flow between samples yield large query-complexity gaps. They provide concrete testers for several properties (e.g., All-zero, Determinism, bounded support, Inv, Sym, Par_k) and prove tight lower bounds in restricted models, including polynomial lower bounds for Inv and Sym in various adaptivity settings. The results illuminate a rich, multi-layered landscape of adaptivity in Huge Object testing with potential implications for streaming-like and distributed computation paradigms, suggesting broad applicability of the methods to other two-scale query models and to distributed-data testing tasks.

Abstract

The Huge Object model for distribution testing, first defined by Goldreich and Ron in 2022, combines the features of classical string testing and distribution testing. In this model we are given access to independent samples from an unknown distribution $P$ over the set of strings $\{0,1\}^n$, but are only allowed to query a few bits from the samples. The distinction between adaptive and non-adaptive algorithms, which occurs naturally in the realm of string testing (while being irrelevant for classical distribution testing), plays a substantial role also in the Huge Object model. In this work we show that the full picture in the Huge Object model is much richer than just that of the ``adaptive vs. non-adaptive'' dichotomy. We define and investigate several models of adaptivity that lie between the fully-adaptive and the completely non-adaptive extremes. These models are naturally grounded by observing the querying process from each sample independently, and considering the ``algorithmic flow'' between them. For example, if we allow no information at all to cross over between samples (up to the final decision), then we obtain the locally bounded adaptive model, arguably the ``least adaptive'' one apart from being completely non-adaptive. A slightly stronger model allows only a ``one-way'' information flow. Even stronger (but still far from being fully adaptive) models follow by taking inspiration from the setting of streaming algorithms. To show that we indeed have a hierarchy, we prove a chain of exponential separations encompassing most of the models that we define.
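
To make the access model described above concrete, here is a minimal Python sketch of sample-and-query access together with a completely non-adaptive check for the All-zero property. Everything in it is an illustrative assumption rather than the paper's construction: the names `HugeObjectOracle` and `nonadaptive_all_zero_test`, and the parameters `num_samples` and `queries_per_sample`, are hypothetical, and no soundness guarantee for a particular distance parameter is claimed.

```python
import random
from typing import Callable, List


class HugeObjectOracle:
    """Hypothetical oracle: holds samples drawn from an unknown distribution P
    over {0,1}^n and reveals individual bits only on request."""

    def __init__(self, sampler: Callable[[random.Random], List[int]], seed: int = 0):
        self._sampler = sampler            # draws one string from P
        self._rng = random.Random(seed)
        self._samples: List[List[int]] = []
        self.queries = 0                   # number of bit queries made so far

    def new_sample(self) -> int:
        """Draw a fresh independent sample and return its handle."""
        self._samples.append(self._sampler(self._rng))
        return len(self._samples) - 1

    def query(self, sample_id: int, index: int) -> int:
        """Reveal a single bit of a previously drawn sample."""
        self.queries += 1
        return self._samples[sample_id][index]


def nonadaptive_all_zero_test(oracle: HugeObjectOracle, n: int,
                              num_samples: int, queries_per_sample: int,
                              seed: int = 1) -> bool:
    """Accept iff every queried bit is 0.

    Non-adaptive: the full list of (sample, index) pairs is fixed
    before any answer is seen.
    """
    rng = random.Random(seed)
    plan = [(s, rng.randrange(n))
            for s in range(num_samples)
            for _ in range(queries_per_sample)]
    ids = [oracle.new_sample() for _ in range(num_samples)]
    answers = [oracle.query(ids[s], i) for s, i in plan]
    return all(bit == 0 for bit in answers)
```

An adaptive variant would instead choose each next `(sample, index)` pair based on the bits already seen; the intermediate models discussed in the abstract restrict how much of that information may pass from one sample to another.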

Paper Structure

This paper contains 43 sections, 36 theorems, 60 equations, 1 figure, 8 algorithms.

Key Result

Lemma 2.9

For a property $\mathcal{P}$ of distributions over strings, and any distribution $P\in\mathcal{D}(\Sigma^n)$, there is a distribution realizing the distance of $P$ from $\mathcal{P}$, i.e. a distribution $Q\in\mathcal{P}_n$ for which $d(P,Q)=d(P,\mathcal{P}_n)$. In particular, the infimum in Definit

Figures (1)

  • Figure 1: Graphical summary of our results

Theorems & Definitions (127)

  • Definition 2.1: Common notations
  • Definition 2.2: Set of distributions
  • Definition 2.3: Property
  • Definition 2.4: Normalized Hamming distance
  • Definition 2.5: Variation distance
  • Definition 2.6: Transfer distribution
  • Definition 2.7: Earth Mover's Distance
  • Definition 2.8: Distance from a property
  • Lemma 2.9
  • Definition 2.10: Fully adaptive algorithm
  • ...and 117 more