
Nonparametric inference for ratios of densities via uniformly valid and powerful permutation tests

Alberto Bordino, Thomas B. Berrett

TL;DR

This work addresses nonparametric testing of density ratios. It introduces a density ratio permutation test that remains uniformly valid under the simple null $H_0:g\propto r f$ by using weight-aware, nonuniform permutation schemes. It connects discrepancy measures based on integral probability metrics (IPMs) with kernel methods, introducing the shifted maximum mean discrepancy $\mathrm{MMD}_{r,k}$ to handle density-ratio shifts, and proves consistency and minimax optimality under Sobolev smoothness with the bandwidth scaling $\zeta = n^{2/(4s+d)}$. The framework extends to unknown ratios via estimation on training data and to conditional testing for covariate-shift and related assumptions in transfer learning and causal inference, with finite-sample guarantees that bound the type I error in terms of the estimation error through total-variation distances. The authors validate the theory with simulations and real-data applications (e.g., New York frisk, Stroop, Two Moons, diamonds), and provide software (DRPTR-DRPT). Overall, the paper delivers a tool for nonparametric density-ratio inference that applies to distributional shifts and to diagnostics in transfer learning and causal inference.

Abstract

We propose the density ratio permutation test, a hypothesis test that assesses whether the ratio between two densities is proportional to a known function based on independent samples from each distribution. The test uses an efficient Markov Chain Monte Carlo scheme to draw weighted permutations of the pooled data, yielding exchangeable samples and finite-sample validity. For power, if the statistic is an integral probability metric, our procedure is consistent under mild assumptions on the defining function class; specializing to a reproducing kernel Hilbert space, we introduce the shifted maximum mean discrepancy and prove minimax optimality of our test when a normalized difference between the densities lies in a Sobolev ball. We extend to the case of an unknown density ratio by estimating it on an independent training sample and derive type I error bounds in terms of the estimation error as well as power results. This allows adapting our method to conditional two-sample testing, making it a versatile tool for assessing covariate-shift and related assumptions, which frequently arise in transfer learning and causal inference. Finally, we validate our theoretical findings through experiments on both simulated and real-world datasets.
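
The weighted-permutation idea in the abstract can be illustrated with a small sketch. A short calculation under $H_0: g \propto r f$ suggests that, conditional on the pooled data, an assignment of points to the two samples has probability proportional to the product of $r$ over the points placed in the second sample. The code below targets that distribution with a plain Metropolis swap chain. This is an illustration under stated assumptions, not the authors' sampler: the name `weighted_swap_sampler`, the single-swap proposal, and the single-chain design are hypothetical choices, and one chain of this kind does not by itself give the exact exchangeability guarantee proved in the paper.

```python
import math
import random


def weighted_swap_sampler(pooled, n, r, n_steps, rng):
    """Metropolis swap chain over assignments of pooled points to two samples.

    Assumed target: P(assignment) is proportional to the product of r(z)
    over the points currently in the second sample (positions n, n+1, ...).
    Positions 0..n-1 play the role of the first sample.
    """
    z = list(pooled)
    total = len(z)
    for _ in range(n_steps):
        i = rng.randrange(0, n)        # position in the first sample
        j = rng.randrange(n, total)    # position in the second sample
        # Swapping moves z[i] into the second sample and z[j] out of it,
        # so the target changes by the factor r(z[i]) / r(z[j]).
        ri, rj = r(z[i]), r(z[j])
        accept = 1.0 if ri >= rj else ri / rj
        if rng.random() < accept:
            z[i], z[j] = z[j], z[i]
    return z


# Hypothetical usage: r strongly favours large values in the second sample.
rng = random.Random(0)
pooled = list(range(19, -1, -1))       # first sample starts with the large values
out = weighted_swap_sampler(pooled, 10, lambda x: math.exp(5 * x), 5000, rng)
```

The swap proposal is symmetric, so accepting with probability $\min\{1, r(z_i)/r(z_j)\}$ leaves the assumed target invariant. With $r \equiv 1$ every swap is accepted and the chain reduces to an ordinary uniform shuffle between the two samples.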

Paper Structure

This paper contains 28 sections, 23 theorems, 208 equations, 5 figures, 2 tables, 2 algorithms.

Key Result

Theorem 1

Assume that $H_0: g \propto r \, f$ is true. Suppose that $\sigma^{(1)}, \ldots, \sigma^{(H)}$ are drawn i.i.d. from the sampling distribution (eq:sampling_1) and write $Z^{(h)} = Z_{\sigma^{(h)}}$ for all $h \in [H]$. Then the sequence $\left(Z, Z^{(1)}, \ldots, Z^{(H)} \right)$ is exchangeable. In particular, for any statist
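
Exchangeability of $\left(Z, Z^{(1)}, \ldots, Z^{(H)}\right)$ is the property that makes a Monte Carlo permutation p-value valid in finite samples. The sketch below uses the standard $(1 + \text{count})/(1 + H)$ form; this form is an assumption here and is not quoted from the truncated statement above, and the function name `permutation_p_value` is hypothetical.

```python
def permutation_p_value(t_obs, t_perm):
    """Monte Carlo permutation p-value from exchangeable copies.

    t_obs  : test statistic on the original data, T(Z)
    t_perm : statistics T(Z^(1)), ..., T(Z^(H)) on the permuted copies
    Large values of the statistic are treated as evidence against H_0.
    """
    exceed = sum(1 for t in t_perm if t >= t_obs)
    # The "+1" terms count the original sample itself among the copies.
    return (1 + exceed) / (1 + len(t_perm))


# Hypothetical usage with H = 4 permuted statistics.
p = permutation_p_value(2.0, [1.0, 3.0, 2.0, 0.5])
```

In this example two of the four permuted statistics are at least as large as the observed one, so the p-value is $(1 + 2)/(1 + 4) = 0.6$.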

Figures (5)

  • Figure 1: (a) Empirical power of (E1) for bivariate data on the unit square: purple, $r(x,y)=4xy$; orange, $r'(x,y)=2x$. Green and blue: alternative approaches based on (E2). (b) Estimated power of (E3) for binary data with varying sample sizes and $r=(1,3)$; $\tau := n/m$. (c) Empirical validation of Proposition \ref{prop:robustness} for Gaussian data with misspecified mean: red, theoretical bound $\alpha + \operatorname{TV}(\mathcal{N}(\mu,1)^{\otimes m}, \mathcal{N}(\nu,1)^{\otimes m}) = \alpha + 2\Phi(\sqrt{m}|\mu-\nu|/2) - 1$; purple, estimate of type I error over $300$ runs of (E1) based on $\widehat{r}$. In all panels, error bars denote $\pm 1$ standard error.
  • Figure 2: (a) Empirical power of (E1) in the causal inference setting, using estimates of the true density ratio $r(z) = \frac{1-\pi}{\pi}\exp\{\beta_0 + \beta^\top z + \gamma(\sin(10 z_1) + z_2 z_3)\}$ obtained via linear logistic (LL) and kernel logistic (KLR) regression; error bars show $\pm 1$ standard error. (b) $p$-values of (E3) for testing $w_1/w_0 = r\, b_1/b_0$ for $r \in \{0.1,\ldots,7\}$ on the New York-frisk datasets for 2011/2012 (blue) and 2015/2016 (brown); horizontal lines give Wald-type confidence intervals. (c) $p$-values of (E1) for testing $g(y)\propto e^{y/\eta} f(y)$ for $\eta \in \{0.01,\dots,0.3\}$ on the Stroop data; sensitivity analysis over $10$ random subsamples of size $n=m=100$.
  • Figure 3: (a) Empirical power of (E1) for testing $p(\theta,x)\propto \widehat{r}(\theta,x)\,p(x)\,p(\theta)$ with $\widehat{r}\in\{\widehat{r}_\mathrm{NRE},\widehat{r}_\mathrm{BNRE}\}$, the estimators in Delaunoy2022BNRE trained on $N_\mathrm{train}\in\{2^4,2^7,2^9,2^{12},2^{15}\}$ independent samples; error bars show $\pm 1$ standard error. (b) Receiver operating characteristic curves for classifiers distinguishing joint samples from product-of-marginals samples weighted by $\widehat{r}_\mathrm{NRE}$ and $\widehat{r}_\mathrm{BNRE}$; curves closer to the diagonal indicate better estimators.
  • Figure 4: Purple: simulation of (E1); here $r(x) = e^{-4x^2}$. Orange: same setting but with $r^\prime(x) = e^{-x^2}$. Green and blue: alternative approaches based on (E2). Error bars show $\pm 1$ standard error.
  • Figure 5: (a) Empirical power of (E6) in a synthetic binary data setting for varying $r \in \{0.1, 0.5, 1, 2, 10\}$. (b) Comparison of (E4), (E5), (E6) in two four-dimensional settings.
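
The theoretical bound quoted in the caption of Figure 1(c), $\alpha + 2\Phi(\sqrt{m}|\mu-\nu|/2) - 1$, can be evaluated directly. The sketch below is only a numerical illustration of that closed form; the function names and the example values of $\alpha$, $\mu$, $\nu$ and $m$ are hypothetical and are not taken from the paper's experiments.

```python
import math


def std_normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def tv_gaussian_product(mu, nu, m):
    """TV distance between N(mu,1)^{(x)m} and N(nu,1)^{(x)m}.

    Uses the closed form 2 * Phi(sqrt(m) * |mu - nu| / 2) - 1 from the caption.
    """
    return 2.0 * std_normal_cdf(math.sqrt(m) * abs(mu - nu) / 2.0) - 1.0


def type1_bound(alpha, mu, nu, m):
    """Bound alpha + TV from the caption (not capped at 1)."""
    return alpha + tv_gaussian_product(mu, nu, m)


# Hypothetical values: nominal level 0.05, mean misspecified by 0.1, m = 100.
bound = type1_bound(0.05, 0.0, 0.1, 100)
```

The bound equals $\alpha$ when $\mu = \nu$ and grows with both $m$ and $|\mu - \nu|$, so a fixed misspecification of the mean becomes more costly as the sample size increases.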

Theorems & Definitions (49)

  • Theorem 1
  • Proposition 2
  • Remark 1
  • Remark 2
  • Theorem 3
  • Lemma 4
  • Lemma 5
  • Theorem 6
  • Lemma 7
  • Remark 3
  • ...and 39 more