Differentially Private High-Dimensional Approximate Range Counting, Revisited

Martin Aumüller, Fabrizio Boninsegna, Francesco Silvestri

TL;DR

This work addresses differentially private, high-dimensional range counting for inner-product (cosine) similarity by revisiting Locality Sensitive Filters to derive a simple, tunable Top-1 data structure (DPTop-1) and a principled reduction from ANN to ANNC that yields DP-ANNC with tractable privacy guarantees. The approach leverages Gaussian projections and concomitant order statistics to set a threshold and aggregate counts, with privacy implemented via Truncated Laplace or Max Projection mechanisms; tensorization further improves preprocessing, space, and query-time trade-offs. The paper shows that these methods achieve competitive accuracy with the state of the art (Andoni et al., NeurIPS 2023/2024) while offering broader parameter ranges and simpler analysis, including the ability to operate under pure DP in some regimes and to trade extra space/time for improved additive error via unbalanced structures. It also introduces CloseTop-1 and TensorCloseTop-1 to remove asymptotic assumptions and to realize near-linear preprocessing and linear space, enabling practical deployment and broader applicability to Euclidean and spherical settings. Open questions include better understanding the fundamental noise-vs.-far-point error trade-offs and conducting experimental comparisons to validate performance in practice.

Abstract

Locality Sensitive Filters are known for offering a quasi-linear space data structure with rigorous guarantees for the Approximate Near Neighbor search (ANN) problem. Building on Locality Sensitive Filters, we derive a simple data structure for the Approximate Near Neighbor Counting (ANNC) problem under differential privacy (DP). Moreover, we provide a simple analysis leveraging a connection with concomitant statistics and extreme value theory. Our approach produces a simple data structure with a tunable parameter that regulates a trade-off between space-time and utility. Through this trade-off, our data structure achieves the same performance as the recent findings of Andoni et al. (NeurIPS 2023) while offering better utility at the cost of higher space and query time. In addition, we provide a more efficient algorithm under pure $\varepsilon$-DP and elucidate the connection between ANN and differentially private ANNC. As a side result, the paper provides a more compact description and analysis of Locality Sensitive Filters for Fair Near Neighbor Search, improving a previous result in Aumüller et al. (TODS 2022).
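The pipeline summarized above (Gaussian projections, a threshold on the query's projections, and aggregated noisy counts) can be illustrated with a minimal sketch. This is not the paper's DPTop-1 algorithm: it swaps in the plain Laplace mechanism for the Truncated Laplace and Max Projection mechanisms, assumes add/remove neighboring datasets, and treats `num_filters` and `threshold` as hypothetical parameters whose principled settings come from the paper's analysis rather than from this code.

```python
import numpy as np


def build_top1_counts(points, num_filters, epsilon, rng):
    """Bucket each point under its best-aligned Gaussian vector; privatize bucket counts."""
    dim = points.shape[1]
    # Data-independent random Gaussian vectors, one per bucket.
    filters = rng.standard_normal((num_filters, dim))
    # Top-1 assignment: each point lands in exactly one bucket,
    # the one whose vector has the largest inner product with it.
    bucket_of_point = np.argmax(points @ filters.T, axis=1)
    counts = np.bincount(bucket_of_point, minlength=num_filters).astype(float)
    # Plain Laplace mechanism (assumption: adding or removing one point
    # changes a single count by 1, so the L1 sensitivity is 1).
    noisy_counts = counts + rng.laplace(scale=1.0 / epsilon, size=num_filters)
    return filters, noisy_counts


def query_count(filters, noisy_counts, query, threshold):
    """Sum the noisy counts of all buckets whose vector projects above the threshold."""
    selected = filters @ query >= threshold
    return float(noisy_counts[selected].sum())


# Usage on synthetic unit vectors (all values below are illustrative).
rng = np.random.default_rng(0)
points = rng.standard_normal((1000, 32))
points /= np.linalg.norm(points, axis=1, keepdims=True)
filters, noisy_counts = build_top1_counts(points, num_filters=64, epsilon=1.0, rng=rng)
hypothetical_threshold = 0.8 * np.sqrt(2.0 * np.log(64))
estimate = query_count(filters, noisy_counts, points[0], hypothetical_threshold)
```

Because every point contributes to exactly one bucket, the noise is added once per bucket at build time and queries only read the already-noisy counts. Raising `num_filters` or lowering `threshold` changes how many buckets a query touches, which is one way to picture the space-time versus utility trade-off the abstract mentions.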

Paper Structure

This paper contains 25 sections, 21 theorems, 7 equations, 1 table, 5 algorithms.

Key Result

Theorem 1

Consider the asymptotic regime, $\varepsilon > 0$, $\delta \in (0, \frac{1}{2})$, $0\leq\beta<\alpha<1$, and $\alpha - \beta=\Omega\left(\sqrt{\frac{\log\log n}{\log n}}\right)$. Let $\mathcal{S}=\{x_i\}_{i=1,\dots, n} \subseteq \mathbb{S}^{d-1}$ and let $\mathbf{q} \in \mathbb{S}^{d-1}$. Then DPTop-1 (Algorithm […]) […] The data structure has pre-processing time $O(d\cdot n^{1+\frac{\rho}{1-\alpha^2}})$, expected quer[…]

Theorems & Definitions (25)

  • Theorem 1
  • Definition 2: $(\alpha, \beta)$-ANN
  • Definition 3: $(\alpha,\beta)$-ANNC
  • Lemma 4 [david1974asymptotic]
  • Theorem 5 [david1974asymptotic, hall1979rate]
  • Definition 6: Approximate Differential Privacy [dwork2014algorithmic]
  • Lemma 7: Probability to Find a Close Point
  • Lemma 8: Expected Number of Buckets and Far Points
  • Theorem 9
  • Corollary 10: Balanced and Unbalanced Top-1
  • ...and 15 more
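
The statement of Definition 3 is not reproduced on this page, so the sketch below relies on a common reading of $(\alpha,\beta)$-ANNC for inner-product similarity, assumed here rather than quoted from the paper: an answer is acceptable when it lies between the number of points with similarity at least $\alpha$ and the number with similarity at least $\beta$. The function names and the `additive_slack` parameter (a stand-in for the additive error that privacy noise introduces) are hypothetical.

```python
import numpy as np


def annc_bounds(points, query, alpha, beta):
    """Return (count of points with <x, q> >= alpha, count with <x, q> >= beta)."""
    sims = points @ query
    lower = int(np.sum(sims >= alpha))  # points that must be counted
    upper = int(np.sum(sims >= beta))   # points that may be counted
    return lower, upper


def is_valid_annc_answer(answer, points, query, alpha, beta, additive_slack=0.0):
    """Check an answer against the assumed (alpha, beta)-ANNC interval, widened by a slack."""
    lower, upper = annc_bounds(points, query, alpha, beta)
    return bool(lower - additive_slack <= answer <= upper + additive_slack)


# Four unit vectors in the plane; similarities to q are 1.0, 0.8, 0.0, -1.0.
points = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-1.0, 0.0]])
q = np.array([1.0, 0.0])
bounds = annc_bounds(points, q, alpha=0.9, beta=0.5)
```

Points with similarity between $\beta$ and $\alpha$ are the ones an approximate structure is free to include or omit, which is why the acceptable answers form an interval rather than a single value.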