Query-Efficient Locally Private Hypothesis Selection via the Scheffe Graph

Gautam Kamath, Alireza F. Pour, Matthew Regehr, David P. Woodruff

TL;DR

This work addresses hypothesis selection under local differential privacy (LDP) by introducing a non-interactive framework based on the relaxed minimum distance estimator (RMDE), which compares candidate distributions using a carefully constructed query set. Central to the method is the Scheffé graph, a combinatorial structure whose dominating set yields a compact, information-rich query family, enabling $\tilde{O}(k^{3/2})$ sample complexity. The authors show how to implement RMDE under $\varepsilon$-LDP via randomized response, obtaining a provable error bound $\|\hat{q}-p\|_1 \le 13\min_{q\in Q}\|q-p\|_1 + \alpha$ with high probability, and provide a $t$-round extension with quantified dependence on $t$. They also prove near-tight barriers: a triangular-substructure lower bound and a counterexample to a flattening conjecture, which together indicate that further improvements require new structural ideas beyond the Scheffé-graph approach. Overall, the paper delivers a near-optimal, non-interactive LDP solution for hypothesis selection and introduces structural tools that may generalize to broader private hypothesis-testing problems.
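To make the randomized-response step concrete, here is a minimal sketch of how a single binary query "does your sample fall in the set $E$?" could be answered under $\varepsilon$-LDP and then debiased by the aggregator. This is the textbook binary randomized-response mechanism, not the paper's full RMDE procedure; the function names, the `seed` parameter, and the representation of an event as a Python set are assumptions made for illustration.

```python
import math
import random


def randomized_response(bit: int, epsilon: float, rng: random.Random) -> int:
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it."""
    keep_prob = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < keep_prob else 1 - bit


def debiased_mean(reports, epsilon: float) -> float:
    """Unbiased estimate of the true fraction of 1-bits from noisy reports."""
    keep_prob = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # E[observed] = (2 * keep_prob - 1) * true_mean + (1 - keep_prob)
    return (observed - (1.0 - keep_prob)) / (2.0 * keep_prob - 1.0)


def estimate_event_probability(samples, event, epsilon: float, seed: int = 0) -> float:
    """Estimate p(E) when each individual privatizes the indicator of E locally.

    `samples` holds one sample per individual; `event` is a set of outcomes.
    (Hypothetical helper: in a real deployment the randomization runs on
    each individual's device, not in one loop.)
    """
    rng = random.Random(seed)
    reports = [randomized_response(int(x in event), epsilon, rng) for x in samples]
    return debiased_mean(reports, epsilon)
```

The debiasing step divides by $2 \cdot \text{keep\_prob} - 1$, so the variance of the estimate grows as $\varepsilon$ shrinks; this is why the number of distinct queries matters for the overall sample cost.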

Abstract

We propose an algorithm with improved query complexity for the problem of hypothesis selection under local differential privacy constraints. Given a set of $k$ probability distributions $Q$, we describe an algorithm that satisfies local differential privacy, performs $\tilde{O}(k^{3/2})$ non-adaptive queries to individuals who each have samples from a probability distribution $p$, and outputs a probability distribution from the set $Q$ which is nearly the closest to $p$. Previous algorithms required either $\Omega(k^2)$ queries or many rounds of interactive queries. Technically, we introduce a new object we dub the Scheffé graph, which captures structure of the differences between distributions in $Q$, and may be of broader interest for hypothesis selection tasks.
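For context, the classical non-private baseline compares hypotheses on their pairwise Scheffé sets $\{x : q_i(x) > q_j(x)\}$ and picks the hypothesis whose masses on those sets best match estimates from the data. The sketch below illustrates that baseline for distributions over a finite domain. It is not the paper's algorithm: it builds one query per ordered pair of hypotheses, which is the quadratic cost the abstract says earlier methods incur. All names, and the `estimate_mass` callback standing in for whatever (possibly privatized) estimator supplies $p(E)$, are assumptions for illustration.

```python
from itertools import permutations


def scheffe_set(q_i, q_j):
    """Outcomes x where q_i puts strictly more mass than q_j."""
    return frozenset(x for x, (a, b) in enumerate(zip(q_i, q_j)) if a > b)


def mass(dist, event):
    """Probability that `dist` (a list of point masses) assigns to `event`."""
    return sum(dist[x] for x in event)


def minimum_distance_estimate(hypotheses, estimate_mass):
    """Return the index of the hypothesis closest to the data on all Scheffé sets.

    `estimate_mass(event)` is a hypothetical callback returning an estimate
    of p(event) for the unknown distribution p.
    """
    k = len(hypotheses)
    # One Scheffé set per ordered pair; duplicates collapse in the set.
    events = {scheffe_set(hypotheses[i], hypotheses[j])
              for i, j in permutations(range(k), 2)}
    targets = {event: estimate_mass(event) for event in events}

    def worst_gap(q):
        # Largest disagreement between q and the estimates over all events.
        return max((abs(mass(q, e) - targets[e]) for e in events), default=0.0)

    return min(range(k), key=lambda i: worst_gap(hypotheses[i]))
```

For example, with `estimate_mass = lambda e: mass(p, e)` for a known `p`, the function returns the hypothesis whose worst-case gap on the Scheffé sets is smallest; replacing the callback with a noisy estimator shows how estimation error on each set propagates to the final choice.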

Paper Structure

This paper contains 11 sections, 12 theorems, 34 equations.

Key Result

Theorem 1

Given a set of $k$ distributions $Q$ and $\tilde{O}(k^{5/2})$ expected preprocessing time (preprocessing involves computing many probabilities $q(E)$ for $q \in Q$, which we treat as constant-time), there exists a non-interactive $\varepsilon$-locally differentially private algorithm such that, given $n \geq n_0$ samples from a distribution $p$, with probability at least $1 - \

Theorems & Definitions (26)

  • Theorem 1
  • Corollary 2
  • Definition 3: DworkMNS06
  • Definition 4: Warner65, EvfimievskiGS03, KasiviswanathanLNRS11
  • Definition 5
  • Definition 6
  • Lemma 7
  • Definition 8: Relaxed Minimum Distance Estimator (RMDE)
  • Theorem 9
  • proof
  • ...and 16 more