Differentially Private Joint Independence Test

Xingwei Liu, Yuexin Chen, Wangli Xu

TL;DR

This paper proposes a dHSIC-based testing procedure that employs a differentially private permutation methodology and investigates the uniform power of the proposed test in the dHSIC metric and the $L_2$ metric, showing that the test attains minimax optimal power across different privacy regimes.

Abstract

Identifying joint dependence among more than two random vectors plays an important role in many statistical applications, where the data may contain sensitive or confidential information. In this paper, we consider the $d$-variable Hilbert-Schmidt independence criterion (dHSIC) in the context of differential privacy. Because the limiting distribution of the empirical estimate of dHSIC is a complicated Gaussian chaos, tests in the non-private regime are typically constructed by permutation or bootstrap. To detect joint dependence under privacy constraints, we propose a dHSIC-based testing procedure that employs a differentially private permutation methodology. Our method enjoys a privacy guarantee, valid level, and pointwise consistency, whereas its bootstrap counterpart suffers from inconsistent power. We further investigate the uniform power of the proposed test in the dHSIC metric and the $L_2$ metric, showing that the proposed test attains minimax optimal power across different privacy regimes. As a byproduct, our results also cover the pointwise and uniform power of the non-private permutation dHSIC test, addressing an open question left in Pfister et al. (2018). Both numerical simulations and a real data analysis on causal inference suggest that the proposed test performs well empirically.
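
As a rough illustration of the kind of procedure the abstract describes, the sketch below computes a biased empirical dHSIC with Gaussian kernels and wraps it in a permutation test whose statistics are perturbed with Laplace noise. It is not the authors' implementation: the function names, the fixed bandwidth, the noise scale `2 * sensitivity / epsilon`, and the `sensitivity` argument (a stand-in for the bound of Proposition 4.1, whose value is not stated here) are all assumptions made for illustration.

```python
import math
import random


def gaussian_gram(xs, bandwidth=1.0):
    """Gram matrix of a Gaussian kernel on scalar observations."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in xs]
            for a in xs]


def dhsic(columns, bandwidth=1.0):
    """Biased (V-statistic) empirical dHSIC for d scalar variables.

    `columns` is a list of d lists, each with the n observations of one variable.
    """
    n = len(columns[0])
    grams = [gaussian_gram(col, bandwidth) for col in columns]
    joint = 0.0  # mean over pairs (a, b) of the product of kernel values
    cross = 0.0  # mean over a of the product of Gram-matrix row means
    for a in range(n):
        row_prod = 1.0
        for gram in grams:
            row_prod *= sum(gram[a]) / n
        cross += row_prod / n
        for b in range(n):
            pair_prod = 1.0
            for gram in grams:
                pair_prod *= gram[a][b]
            joint += pair_prod / (n * n)
    marginal = 1.0  # product of the overall kernel means
    for gram in grams:
        marginal *= sum(sum(row) for row in gram) / (n * n)
    return joint + marginal - 2.0 * cross


def standard_laplace(rng):
    """Laplace(0, 1) draw as the difference of two unit exponentials."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def dp_permutation_pvalue(columns, epsilon, sensitivity, n_perm=99, seed=0):
    """Permutation p-value computed from Laplace-perturbed dHSIC statistics."""
    rng = random.Random(seed)
    scale = 2.0 * sensitivity / epsilon  # assumed noise scale, see lead-in
    observed = dhsic(columns) + scale * standard_laplace(rng)
    count = 1  # the observed statistic counts itself
    for _ in range(n_perm):
        # Keep the first variable fixed and shuffle each of the others.
        permuted = [columns[0]] + [rng.sample(col, len(col)) for col in columns[1:]]
        if dhsic(permuted) + scale * standard_laplace(rng) >= observed:
            count += 1
    return count / (n_perm + 1)


# Toy data: three strongly dependent variables (hypothetical example).
data_rng = random.Random(1)
x = [data_rng.gauss(0.0, 1.0) for _ in range(40)]
y = [v + 0.1 * data_rng.gauss(0.0, 1.0) for v in x]
z = [v + 0.1 * data_rng.gauss(0.0, 1.0) for v in x]
print(dp_permutation_pvalue([x, y, z], epsilon=1.0, sensitivity=0.05))
```

In this sketch the noise scale grows as `epsilon` shrinks, so stronger privacy makes the perturbed statistics harder to separate from their permuted counterparts. The paper's power analysis quantifies this trade-off; the sketch only illustrates its direction.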

Paper Structure

This paper contains 41 sections, 19 theorems, 167 equations, 6 figures, 1 table, 2 algorithms.

Key Result

Lemma 2.1

Suppose that each algorithm $\mathcal{M}_i$ is $(\epsilon_i, \delta_i)$-differentially private for $i \in [m]$. Then, the composed algorithm $\mathcal{M}_{1:m}$ defined as $\mathcal{M}_{1:m} := (\mathcal{M}_1, \dots, \mathcal{M}_m)$ is $(\sum_{i=1}^m \epsilon_i, \sum_{i=1}^m \delta_i)$-differentiall
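
To make the bookkeeping in Lemma 2.1 concrete, here is a minimal sketch that adds up the privacy parameters of several mechanisms run on the same data. The function name `compose_budgets` and the example budgets are hypothetical.

```python
def compose_budgets(budgets):
    """Basic composition: sum the (epsilon, delta) pairs of m mechanisms."""
    total_epsilon = sum(eps for eps, _ in budgets)
    total_delta = sum(delta for _, delta in budgets)
    return total_epsilon, total_delta


# Two hypothetical mechanisms: one (0.5, 1e-6)-DP and one (0.25, 0)-DP.
print(compose_budgets([(0.5, 1e-6), (0.25, 0.0)]))  # prints (0.75, 1e-06)
```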

Figures (6)

  • Figure 1: Testing joint independence in the setting of Simulation 2 as the privacy parameter $\epsilon$ varies. We set the sample size $n=1000$ and the level $\alpha=0.05$, and vary the privacy parameter and the standard error $\sigma$ as follows: (Left) Privacy parameter $\epsilon$ from $50/n$ to $10/\sqrt{n}$ and standard error $\sigma=2$. (Middle) Privacy parameter $\epsilon$ from $0.5$ to $1.5$ and standard error $\sigma=2$. (Right) Privacy parameter $\epsilon$ from $1$ to $25$ and standard error $\sigma=3$.
  • Figure 2: Testing joint independence in the setting of Simulation 2 as the sample size $n$ varies. We set the level $\alpha=0.05$ and choose the privacy parameter and the standard error $\sigma$ as follows: (Left) Privacy parameter $\epsilon=10/\sqrt{n}$ and standard error $\sigma=2$. (Middle) Privacy parameter $\epsilon=1$ and standard error $\sigma=3$. (Right) Privacy parameter $\epsilon=\sqrt{n}/10$ and standard error $\sigma=3$.
  • Figure 3: Testing joint independence among $d$ variables in Simulation 3 as the dimension $d$ varies. We set the level $\alpha=0.05$ and choose the privacy parameter and $\rho$ as follows: (Left) Privacy parameter $\epsilon=10/\sqrt{n}$ and $\rho=0.3$. (Middle) Privacy parameter $\epsilon=1$ and $\rho=0.3$. (Right) Privacy parameter $\epsilon=\sqrt{n}/5$ and $\rho=0.2$.
  • Figure 4: Testing joint independence among $3$ vectors or variables in Simulation 3 as the dimension $d$ varies. We set the level $\alpha=0.05$ and choose the privacy parameter and $\rho$ as follows: (Left) Privacy parameter $\epsilon=10/\sqrt{n}$ and $\rho=0.3$. (Middle) Privacy parameter $\epsilon=1$ and $\rho=0.3$. (Right) Privacy parameter $\epsilon=\sqrt{n}/10$ and $\rho=0.2$.
  • Figure 5: Two candidate DAGs are presented. The left is referred to as DAG a, and the right is referred to as DAG b.
  • ...and 1 more figure

Theorems & Definitions (37)

  • Definition 2.1: Differential Privacy (Dwork and Roth, 2014)
  • Lemma 2.1: Composition
  • Definition 2.2: $\ell_p$-Sensitivity (Kim and Schrab, 2023)
  • Lemma 2.2: The Laplace Mechanism
  • Lemma 3.1
  • Proposition 4.1: Sensitivity of Empirical Permutation ${\rm{dHSIC}}$
  • Theorem 4.1: Properties of dpdHSIC Test
  • Theorem 4.2: Minimum Separation of $\phi_{{\rm{dpdHSIC}}}$
  • Theorem 4.3: Minimax Separation in dHSIC
  • Theorem 4.4: Minimum Separation of $\phi_{{\rm{dpdHSIC}}}$ over ${\mathcal{P}}^s_{L_2}$
  • ...and 27 more