Differentially Private Permutation Tests: Applications to Kernel Methods

Ilmun Kim; Antonin Schrab

TL;DR

The paper tackles private hypothesis testing by introducing differentially private permutation tests that retain finite-sample validity under $(\varepsilon,\delta)$-DP. It develops a refined privatization method based on a quantile representation, which gives finite-sample control of the type I error together with consistency and non-asymptotic uniform power guarantees. The authors instantiate the framework with kernel-based statistics, deriving dpMMD (two-sample testing) and dpHSIC (independence testing), and establish minimax optimal separation rates across privacy regimes, along with negative results for U-statistic-based private tests. Simulations on synthetic data and CelebA images show competitive empirical power, and the code is publicly available.

Abstract

Recent years have witnessed growing concerns about the privacy of sensitive data. In response to these concerns, differential privacy has emerged as a rigorous framework for privacy protection, gaining widespread recognition in both academic and industrial circles. While substantial progress has been made in private data analysis, existing methods often suffer from impracticality or a significant loss of statistical efficiency. This paper aims to alleviate these concerns in the context of hypothesis testing by introducing differentially private permutation tests. The proposed framework extends classical non-private permutation tests to private settings, maintaining both finite-sample validity and differential privacy in a rigorous manner. The power of the proposed test depends on the choice of a test statistic, and we establish general conditions for consistency and non-asymptotic uniform power. To demonstrate the utility and practicality of our framework, we focus on reproducing kernel-based test statistics and introduce differentially private kernel tests for two-sample and independence testing: dpMMD and dpHSIC. The proposed kernel tests are straightforward to implement, applicable to various types of data, and attain minimax optimal power across different privacy regimes. Our empirical evaluations further highlight their competitive power under various synthetic and real-world scenarios, emphasizing their practical value. The code is publicly available to facilitate the implementation of our framework.
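To make the idea of a private permutation test concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a Gaussian kernel, a plug-in MMD statistic, Laplace noise added independently to the original and permuted statistics, and a noise scale of `2 * sensitivity / epsilon`. The sensitivity bound `sqrt(2) / min(n, m)` is our own back-of-the-envelope bound for a kernel with values in $[0,1]$, and all function names are hypothetical. The exact calibration that yields the stated privacy guarantee should be taken from the paper.

```python
import numpy as np


def gaussian_gram(z, gamma=1.0):
    """Gaussian kernel Gram matrix exp(-gamma * ||z_i - z_j||^2); values lie in (0, 1]."""
    sq = np.sum(z ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (z @ z.T), 0.0)
    return np.exp(-gamma * d2)


def mmd_plugin(gram, weights):
    """Plug-in MMD: RKHS norm of the weighted difference of mean embeddings."""
    return float(np.sqrt(max(weights @ gram @ weights, 0.0)))


def private_permutation_pvalue(x, y, epsilon, n_perm=199, gamma=1.0, seed=None):
    """Permutation p-value computed from Laplace-perturbed MMD statistics.

    x, y: arrays of shape (n, d) and (m, d).
    Assumptions (not taken from the source): the sensitivity bound and the
    factor 2 in the noise scale below.
    """
    rng = np.random.default_rng(seed)
    n, m = len(x), len(y)
    gram = gaussian_gram(np.vstack([x, y]), gamma)

    # +1/n on the first sample, -1/m on the second; permuting these weights
    # is equivalent to permuting the pooled sample.
    w = np.concatenate([np.full(n, 1.0 / n), np.full(m, -1.0 / m)])

    sensitivity = np.sqrt(2.0) / min(n, m)  # assumed bound for a kernel in [0, 1]
    scale = 2.0 * sensitivity / epsilon     # assumed noise calibration

    # Index 0 holds the statistic on the observed labelling.
    stats = [mmd_plugin(gram, w)]
    for _ in range(n_perm):
        stats.append(mmd_plugin(gram, w[rng.permutation(n + m)]))

    # Independent Laplace noise on every statistic keeps the noisy values
    # exchangeable under the null, so the usual permutation p-value applies.
    noisy = np.asarray(stats) + scale * rng.laplace(size=n_perm + 1)
    return float((1 + np.sum(noisy[1:] >= noisy[0])) / (n_perm + 1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 1.0, size=(100, 1))
    y = rng.normal(3.0, 1.0, size=(100, 1))
    print(private_permutation_pvalue(x, y, epsilon=1.0, seed=1))
```

The test rejects at level $\alpha$ when the returned p-value is at most $\alpha$. Computing the Gram matrix once and permuting only the weight vector avoids recomputing kernel evaluations for every permutation.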

Paper Structure (103 sections, 39 theorems, 382 equations, 16 figures, 3 algorithms)

This paper contains 103 sections, 39 theorems, 382 equations, 16 figures, 3 algorithms.

Introduction
Related Work
An Overview of Our Results
Organization
Notation
Background: Differential Privacy
Differentially Private Permutation Tests
Proposed Privatization Method
Naive Approach
Refined Approach
Validity and Privacy Guarantee
Power Analysis
Application: Differentially Private Kernel Tests
Terminology
Differentially Private MMD Test
...and 88 more sections

Key Result

Lemma 1

Suppose that an algorithm $\mathcal{A}$ is $(\varepsilon,\delta)$-differentially private. Then for an arbitrary randomized function $f$, the composition $f \circ \mathcal{A}$ is also $(\varepsilon,\delta)$-differentially private.
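For intuition, the deterministic case follows in one line from the definition of differential privacy. The notation here is ours rather than the paper's: $D$ and $D'$ denote neighboring datasets, and $S$ is any measurable set of outputs of $f$.

$$
\begin{aligned}
\Pr\bigl[f(\mathcal{A}(D)) \in S\bigr]
  &= \Pr\bigl[\mathcal{A}(D) \in f^{-1}(S)\bigr] \\
  &\le e^{\varepsilon}\,\Pr\bigl[\mathcal{A}(D') \in f^{-1}(S)\bigr] + \delta \\
  &= e^{\varepsilon}\,\Pr\bigl[f(\mathcal{A}(D')) \in S\bigr] + \delta .
\end{aligned}
$$

A randomized $f$ whose internal randomness is independent of the data is a mixture of deterministic maps, so the same bound holds after averaging over that randomness.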

Figures (16)

Figure 1: Perturbed uniform $d$-dimensional densities on $[0,1]^d$ with varying perturbation amplitude $a$.
Figure 2: Comparing uniform vs. perturbed uniform while varying the privacy level $\varepsilon$. We set the sample sizes $m = n = 3000$ and dimension $d=1$, and change the privacy level $\varepsilon$ and perturbation amplitude $a$ as follows: (Left) Privacy level $\varepsilon$ from $1/n$ to $10/\sqrt{n}$, perturbation amplitude $a=0.2$. (Middle) Privacy level $\varepsilon$ from $10/\sqrt{n}$ to $1$, perturbation amplitude $a=0.15$. (Right) Privacy level $\varepsilon$ from $1$ to $\sqrt{n}$, perturbation amplitude $a=0.1$.
Figure 3: Comparing uniform vs. perturbed uniform while varying the sample sizes $m=n$. We set the dimension $d=1$ and perturbation amplitude $a=0.1$. We change the privacy level as follows: (Left) Privacy level $\varepsilon=10/\sqrt{n}$. (Middle) Privacy level $\varepsilon=1$. (Right) Privacy level $\varepsilon=\sqrt{n}/10$.
Figure 4: Comparing uniform vs. perturbed uniform while varying the dimension $d$. We set the sample sizes $m = n = 3000$ and perturbation amplitude $a=0.2$. We change the privacy level as follows: (Left) Privacy level $\varepsilon=10/\sqrt{n}$. (Middle) Privacy level $\varepsilon=1$. (Right) Privacy level $\varepsilon=\sqrt{n}/10$.
Figure 5: Selected CelebA images (https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html) in dimension $3\times 178\times 218$.
...and 11 more figures

Theorems & Definitions (60)

Definition 1: Differential Privacy
Lemma 1: Post-Processing
Lemma 2: Composition
Lemma 3: Group Privacy
Definition 2: Global $\ell_p$-Sensitivity
Definition 3: Laplace Mechanism
Lemma 4: Differential Privacy of Laplace Mechanism
Remark 1: Gaussian Mechanism
Example 1: Sensitivity of Integral Probability Metric
Theorem 1: Validity Guarantee
...and 50 more
