
A Space Lower Bound for Approximate Membership with Duplicate Insertions or Deletions of Nonelements

Aryan Agarwala, Guy Even

TL;DR

This work establishes a space lower bound for dynamic filters when duplicate insertions or deletions of nonelements are allowed. By constructing a witness-based dynamic framework and reducing to static filters, the authors prove that, for $u\ge 2n$, the required space is at least $\tfrac{1}{2}(1-\epsilon^{+}-\tfrac{1}{n})\log\binom{u}{n}-O(n)$ bits, which is roughly half of the $\log\binom{u}{n}$ bits needed for an exact representation. The proof proceeds through a sequence of reductions, including a sticky false-positive lemma and a reduction to N-static filters. The resulting bound does not depend on running time or memory access patterns, and it holds even against an oblivious adversary. These results quantify the intrinsic space cost of dynamic approximate membership with relaxed operation semantics, which is relevant to the design of Bloom filters, Cuckoo filters, and related dynamic filters. In short, allowing duplicate insertions or deletions of nonelements forces space within a constant factor of exact representation, revealing a trade-off between the expressiveness of the supported operations and memory efficiency.

Abstract

Designs of data structures for approximate membership queries with false-positive errors that support both insertions and deletions stipulate the following two conditions: (1) Duplicate insertions are prohibited, i.e., it is prohibited to insert an element $x$ if $x$ is currently a member of the dataset. (2) Deletions of nonelements are prohibited, i.e., it is prohibited to delete $x$ if $x$ is not currently a member of the dataset. Under these conditions, the space required for the approximate representation of datasets of cardinality $n$ with a false-positive probability of $\epsilon^{+}$ is at most $(1+o(1))n\cdot\log_2 (1/\epsilon^{+}) + O(n)$ bits [Bender et al., 2018; Bercea and Even, 2019]. We prove that if these conditions are lifted, then the space required for the approximate representation of datasets of cardinality $n$ from a universe of cardinality $u$ is at least $\frac{1}{2} \cdot (1-\epsilon^{+} -\frac{1}{n})\cdot \log \binom{u}{n} -O(n)$ bits.
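To get a feel for the gap between the two bounds in the abstract, the following minimal sketch evaluates only their leading terms. It is an illustration under stated assumptions, not part of the paper: the function names are hypothetical, the logarithm in the lower bound is assumed to be base 2, and the $o(1)$ and $O(n)$ terms are dropped, so the numbers are indicative only.

```python
import math


def log2_binom(u: int, n: int) -> float:
    """Return log2 of C(u, n) as a sum of logs, avoiding huge intermediate integers."""
    if not 0 <= n <= u:
        raise ValueError("need 0 <= n <= u")
    # C(u, n) = prod_{i=0}^{n-1} (u - i) / (n - i)
    return sum(math.log2(u - i) - math.log2(n - i) for i in range(n))


def upper_bound_leading(n: int, eps: float) -> float:
    """Leading term n * log2(1/eps) of the upper bound that holds when
    conditions (1) and (2) are enforced. Assumption: o(1) and O(n) terms dropped."""
    return n * math.log2(1 / eps)


def lower_bound_leading(u: int, n: int, eps: float) -> float:
    """Leading term (1/2) * (1 - eps - 1/n) * log2 C(u, n) of the lower bound
    that holds when the conditions are lifted. Assumption: the -O(n) term is dropped."""
    return 0.5 * (1 - eps - 1 / n) * log2_binom(u, n)


def compare(u: int, n: int, eps: float) -> dict:
    """Collect both leading terms and the exact-representation cost, in bits."""
    return {
        "exact": log2_binom(u, n),
        "upper_with_conditions": upper_bound_leading(n, eps),
        "lower_without_conditions": lower_bound_leading(u, n, eps),
    }
```

For example, `compare(2**32, 2**10, 2**-8)` returns the three quantities for a small dataset drawn from a 32-bit universe. The upper-bound term grows with $\log_2(1/\epsilon^{+})$ and is independent of $u$, whereas the lower-bound term grows with $\log\binom{u}{n}$ and therefore with the universe size.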

Paper Structure (16 sections, 6 theorems, 24 equations)

This paper contains 16 sections, 6 theorems, 24 equations.

Key Result

Theorem 10

For every N-Static filter $F\in F_{\mathsf{static}}(u,n,p_{\mathsf{fail}},\epsilon^{-})$ and for every $\alpha>1$, it holds that

Theorems & Definitions (28)

  • Definition 1: Syntax of Static Filter
  • Definition 2: Semantics of N-Static Filter
  • Definition 3: Space of Static Filter
  • Definition 4: Sequences of Operations and Datasets
  • Definition 5: Duplicate-Insertions and Deletion-of-Nonelements
  • Definition 6: $(u,n)$-Sequences
  • Definition 7: Syntax of a Dynamic Filter
  • Definition 8: Semantics of a Dynamic Filter
  • Definition 9: Dynamic Filter Space
  • Theorem 10
  • ...and 18 more
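
The two relaxed operations named in Definition 5 can be illustrated by replaying a sequence of operations and flagging each one. This is a hedged sketch based only on the wording of the abstract, not on the paper's formal definitions: the `("ins", x)` / `("del", x)` encoding and the function name are hypothetical, and the dataset is assumed to be a plain set, so a duplicate insertion or a deletion of a nonelement leaves it unchanged.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical encoding of an operation: ("ins", x) or ("del", x).
Op = Tuple[str, int]


def classify_ops(ops: Iterable[Op]) -> List[str]:
    """Replay ops on an initially empty dataset and label each operation.

    Labels: "ok", "duplicate-insertion" (insert x while x is a member),
    or "deletion-of-nonelement" (delete x while x is not a member).
    """
    dataset: Set[int] = set()
    labels: List[str] = []
    for kind, x in ops:
        if kind == "ins":
            labels.append("duplicate-insertion" if x in dataset else "ok")
            dataset.add(x)  # assumption: set semantics, no multiplicities
        elif kind == "del":
            labels.append("deletion-of-nonelement" if x not in dataset else "ok")
            dataset.discard(x)  # assumption: deleting a nonelement is a no-op
        else:
            raise ValueError(f"unknown operation kind: {kind!r}")
    return labels
```

A sequence in which every label is `"ok"` satisfies both conditions stipulated by prior designs. The lower bound concerns filters that must also handle sequences containing the other two labels.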