A Space Lower Bound for Approximate Membership with Duplicate Insertions or Deletions of Nonelements
Aryan Agarwala, Guy Even
TL;DR
This work establishes space lower bounds for dynamic filters when duplicate insertions or deletions of nonelements are allowed. By constructing a witness-based dynamic framework and reducing to static filters, the authors prove that, for $u\ge 2n$, the required space is at least $\tfrac{1}{2}(1-\epsilon^{+}-\tfrac{1}{n})\log\binom{u}{n}-O(n)$ bits. This is a constant fraction (close to one half when $\epsilon^{+}$ is small) of the $\log\binom{u}{n}$ bits needed for an exact representation. The proof proceeds through a sequence of reductions, including a sticky false-positive lemma and a reduction to N-static filters. The resulting lower bound holds against an oblivious adversary and does not depend on running time or access patterns. These results clarify the intrinsic space cost of dynamic approximate membership with relaxed operation semantics, which is relevant to the design of Bloom filters, Cuckoo filters, and related dynamic filters. In short, allowing duplicate insertions or deletions of nonelements forces space within a constant factor of an exact representation, a trade-off between the expressiveness of operations and memory efficiency.
Abstract
Designs of data structures for approximate membership queries with false-positive errors that support both insertions and deletions stipulate the following two conditions: (1) Duplicate insertions are prohibited, i.e., it is prohibited to insert an element $x$ if $x$ is currently a member of the dataset. (2) Deletions of nonelements are prohibited, i.e., it is prohibited to delete $x$ if $x$ is not currently a member of the dataset. Under these conditions, the space required for the approximate representation of a dataset of cardinality $n$ with a false-positive probability of $\epsilon^{+}$ is at most $(1+o(1))n\cdot\log_2 (1/\epsilon^{+}) + O(n)$ bits [Bender et al., 2018; Bercea and Even, 2019]. We prove that if these conditions are lifted, then the space required for the approximate representation of datasets of cardinality $n$ from a universe of cardinality $u$ is at least $\frac{1}{2} \cdot (1-\epsilon^{+} -\frac{1}{n})\cdot \log \binom{u}{n} -O(n)$ bits.
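To get a feel for the gap between the two bounds, the following minimal Python sketch evaluates only their leading terms. It is an illustration, not part of the paper: the function names are hypothetical, the $O(n)$ and $o(1)$ terms are dropped, and the unspecified logarithm in the lower bound is assumed to be base 2, matching the upper bound.

```python
import math


def log2_binom(u: int, n: int) -> float:
    """Return log2 of the binomial coefficient C(u, n).

    Uses C(u, n) = prod_{i=0}^{n-1} (u - i) / (n - i) and sums the logs,
    which avoids building a huge integer and avoids cancellation.
    """
    return math.fsum(math.log2((u - i) / (n - i)) for i in range(n))


def restricted_upper_leading(n: int, eps: float) -> float:
    """Leading term n * log2(1/eps) of the upper bound that holds when
    duplicate insertions and deletions of nonelements are prohibited.
    The (1 + o(1)) factor and the O(n) term are ignored (assumption)."""
    return n * math.log2(1 / eps)


def relaxed_lower_leading(u: int, n: int, eps: float) -> float:
    """Leading term (1/2) * (1 - eps - 1/n) * log2 C(u, n) of the lower
    bound that holds when the two prohibitions are lifted.
    The -O(n) term is ignored (assumption)."""
    return 0.5 * (1 - eps - 1 / n) * log2_binom(u, n)


if __name__ == "__main__":
    # Illustrative parameters only; they are not taken from the paper.
    n, u, eps = 10**4, 2**64, 2**-8
    print("restricted, upper (leading term):", restricted_upper_leading(n, eps))
    print("relaxed, lower (leading term):   ", relaxed_lower_leading(u, n, eps))
    print("exact representation:            ", log2_binom(u, n))
```

Because the constants hidden in $O(n)$ are omitted, these numbers show only how the leading terms scale: the restricted upper bound depends on $n$ and $\epsilon^{+}$ alone, whereas the relaxed lower bound grows with the universe size $u$.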
