Federated Nonparametric Hypothesis Testing with Differential Privacy Constraints: Optimal Rates and Adaptive Tests

T. Tony Cai, Abhinav Chakraborty, Lasse Vuursteen

TL;DR

The paper studies federated nonparametric goodness-of-fit testing under distributed $(\epsilon,\delta)$-differential privacy (DP), focusing on the white-noise-with-drift model and Besov smoothness classes. It derives minimax lower and upper bounds for the separation rate $\rho$, revealing multiple phase transitions and showing that access to a shared source of randomness can improve testing rates. It then constructs three DP-compliant testing procedures, each optimal (up to logarithmic factors) in a different privacy-budget regime, and extends them to adaptive, data-driven tests for the case where the regularity parameter $s$ is unknown. The results quantify the privacy cost of distributed testing and provide adaptive strategies with minimal DP overhead, offering guidance for designing privacy-preserving federated hypothesis tests in nonparametric settings.
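For orientation, the setting summarized above can be written out roughly as follows. This is a sketch in assumed notation, not quoted from the paper: the superscript $(j)$ indexing servers, the subscript $i$ indexing observations, the transcript symbol $Y^{(j)}$ and the Besov-ball symbol $\mathcal{B}^{s}(R)$ are illustrative choices. Each of $m$ servers holds $n$ observations from the white-noise-with-drift model, the test separates the zero signal from signals at $L_2$-distance at least $\rho$, and each server's released transcript must satisfy the standard $(\epsilon,\delta)$-DP inequality with respect to its own data.

$$
\begin{aligned}
&dX^{(j)}_{t;i} = f(t)\,dt + \sigma\,dW^{(j)}_{t;i}, \qquad t \in [0,1],\quad i = 1,\dots,n,\quad j = 1,\dots,m, \\
&H_0 : f \equiv 0 \qquad \text{versus} \qquad H_1 : f \in \mathcal{B}^{s}(R),\ \|f\|_{L_2} \ge \rho, \\
&\Pr\big(Y^{(j)} \in A \mid X^{(j)} = x\big) \le e^{\epsilon}\,\Pr\big(Y^{(j)} \in A \mid X^{(j)} = x'\big) + \delta
\quad \text{for all measurable } A \text{ and all } x, x' \text{ differing in one observation.}
\end{aligned}
$$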

Abstract

Federated learning has attracted significant recent attention due to its applicability across a wide range of settings where data is collected and analyzed across disparate locations. In this paper, we study federated nonparametric goodness-of-fit testing in the white-noise-with-drift model under distributed differential privacy (DP) constraints. We first establish matching lower and upper bounds, up to a logarithmic factor, on the minimax separation rate. This optimal rate serves as a benchmark for the difficulty of the testing problem, factoring in model characteristics such as the number of observations, noise level, and regularity of the signal class, along with the strictness of the $(ε,δ)$-DP requirement. The results demonstrate interesting and novel phase transition phenomena. Furthermore, the results reveal an interesting phenomenon that distributed one-shot protocols with access to shared randomness outperform those without access to shared randomness. We also construct a data-driven testing procedure that possesses the ability to adapt to an unknown regularity parameter over a large collection of function classes with minimal additional cost, all while maintaining adherence to the same set of DP constraints.
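To make the setup concrete, here is a minimal toy sketch of a federated, privacy-constrained goodness-of-fit test. It is not one of the paper's procedures. It assumes a finite-dimensional version of the problem (each observation is a vector of `d` noisy coefficients), uses a clipped chi-square-type statistic on each server, and privatizes it with the Laplace mechanism, which gives pure $\epsilon$-DP and therefore also $(\epsilon,\delta)$-DP. All function names and parameters (`local_private_statistic`, `null_threshold`, `federated_test`, `tau`) are hypothetical.

```python
import numpy as np


def local_private_statistic(X, sigma, eps, tau, rng):
    """One server's private transcript from its (n, d) array of observations."""
    n, d = X.shape
    xbar = X.mean(axis=0)
    # Under H0 (zero signal), n * ||xbar||^2 / sigma^2 is chi-square with d
    # degrees of freedom, so this standardized value has mean 0 and variance 1.
    stat = (n * np.sum(xbar ** 2) / sigma ** 2 - d) / np.sqrt(2 * d)
    # Clipping bounds the output to [-tau, tau], so changing one observation
    # moves it by at most 2 * tau (a crude but valid sensitivity bound).
    clipped = np.clip(stat, -tau, tau)
    # Laplace mechanism: noise scale = sensitivity / eps gives eps-DP.
    return float(clipped + rng.laplace(scale=2 * tau / eps))


def null_threshold(m, d, eps, tau, alpha, rng, n_mc=2000):
    """Monte Carlo (1 - alpha)-quantile of the averaged transcript under H0.

    The null distribution does not depend on the data, so this calibration
    consumes no privacy budget.
    """
    chi2 = rng.chisquare(d, size=(n_mc, m))
    stats = np.clip((chi2 - d) / np.sqrt(2 * d), -tau, tau)
    stats = stats + rng.laplace(scale=2 * tau / eps, size=(n_mc, m))
    return float(np.quantile(stats.mean(axis=1), 1 - alpha))


def federated_test(data, sigma, eps, tau=3.0, alpha=0.05, rng=None):
    """Central server: average the m private transcripts and compare to the null."""
    rng = np.random.default_rng() if rng is None else rng
    m = len(data)
    d = data[0].shape[1]
    transcripts = [local_private_statistic(X, sigma, eps, tau, rng) for X in data]
    T = float(np.mean(transcripts))
    threshold = null_threshold(m, d, eps, tau, alpha, rng)
    return T > threshold, T, threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n, d, sigma = 400, 50, 8, 1.0
    # Alternative: every coefficient of the signal equals 1.
    data = [1.0 + sigma * rng.standard_normal((n, d)) for _ in range(m)]
    print(federated_test(data, sigma, eps=1.0, rng=rng))
```

Clipping the final statistic is the simplest way to bound sensitivity, but it discards information about large deviations and is not claimed to be rate-optimal. In this toy, shrinking `eps` inflates the Laplace noise and so raises the signal strength needed for detection, which is the qualitative privacy cost that the abstract's minimax rate quantifies.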

Paper Structure

This paper contains 38 sections, 49 theorems, 408 equations, 2 figures, and 1 table.

Key Result

Theorem 1

Let $s, R > 0$ be given and consider any sequences of natural numbers $m \equiv m_N$ and $n := N/m$ such that $N = mn \to \infty$, $1/N \ll \sigma \equiv \sigma_N = O(1)$, $\epsilon \equiv \epsilon_N \in (N^{-1},1]$ and $\delta \equiv \delta_N \lesssim N^{-(1 + \omega)}$ for any constant $\omega >$ ...

Figures (2)

  • Figure 1: Illustration of federated $(\epsilon,\delta)$-DP-constrained testing.
  • Figure 2: The minimax testing rate $\rho$ as a function of $\epsilon$, given by \ref{eq:local_randomness_rate_single_line} and \ref{eq:shared_randomness_rate_single_line}, for $(n,m)=(5,5)$ in the left column and $(n,m)=(2,15)$ in the right column, with $\sigma=1$ and smoothness levels $s=1/5$, $s=1/2$, $s=1$ and $s=3$. The top row corresponds to distributed $(\epsilon,\delta)$-DP protocols with local randomness only (i.e. \ref{eq:local_randomness_rate_single_line}); the bottom row corresponds to distributed $(\epsilon,\delta)$-DP protocols with shared randomness (i.e. \ref{eq:shared_randomness_rate_single_line}). The regimes correspond to the six regimes (i.e. different rates) in Table \ref{tab:rate_table}.

Theorems & Definitions (87)

  • Definition 1
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Theorem 5
  • ...and 77 more