DUAL: Learning Diverse Kernels for Aggregated Two-sample and Independence Testing

Zhijian Zhou, Xunye Tian, Liuhua Peng, Chao Lei, Antonin Schrab, Danica J. Sutherland, Feng Liu

TL;DR

This work proposes an aggregated statistic that explicitly incorporates kernel diversity, measured by the covariance between different kernels. This motivates a testing framework with selection inference, which uses information from the training phase to select kernels with strong individual performance from the learned diverse kernel pool.

Abstract

To adapt kernel two-sample and independence testing to complex structured data, aggregation of multiple kernels is frequently employed to boost testing power compared to single-kernel tests. However, we observe that directly maximizing multiple kernel-based statistics may yield highly similar kernels that capture largely overlapping information, limiting the effectiveness of aggregation. To address this, we propose an aggregated statistic that explicitly incorporates kernel diversity based on the covariance between different kernels. Moreover, we identify a fundamental challenge: a trade-off between the diversity among kernels and the test power of individual kernels, i.e., the selected kernels should be both effective and diverse. This motivates a testing framework with selection inference, which leverages information from the training phase to select kernels with strong individual performance from the learned diverse kernel pool. We provide rigorous theoretical statements and proofs showing consistency of the test power and control of the Type-I error, along with an asymptotic analysis of the proposed statistics. Lastly, we conduct extensive empirical experiments demonstrating the superior performance of our proposed approach across various benchmarks for both two-sample and independence testing.
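
The idea of measuring kernel diversity through covariance can be illustrated with a small sketch. It is not the paper's statistic: it assumes Gaussian kernels with hypothetical bandwidths, uses the standard unbiased MMD $U$-statistic, and takes the empirical covariance of the per-pair core terms $h_k(z_i, z_j)$ as a rough proxy for how much two kernels overlap. All function names and parameter values below are illustrative assumptions.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth):
    """Gaussian kernel between two points given as sequences of floats."""
    sq_dist = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_core_terms(xs, ys, bandwidth):
    """Core terms h(z_i, z_j), i != j, of the unbiased MMD^2 U-statistic.

    h = k(x_i, x_j) + k(y_i, y_j) - k(x_i, y_j) - k(x_j, y_i);
    this form assumes both samples have the same size.
    """
    n = len(xs)
    assert n == len(ys), "this sketch assumes equal sample sizes"
    terms = []
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            terms.append(
                gaussian_kernel(xs[i], xs[j], bandwidth)
                + gaussian_kernel(ys[i], ys[j], bandwidth)
                - gaussian_kernel(xs[i], ys[j], bandwidth)
                - gaussian_kernel(xs[j], ys[i], bandwidth)
            )
    return terms


def mean(values):
    return sum(values) / len(values)


def covariance_matrix(rows):
    """Empirical covariance between rows (one row of core terms per kernel)."""
    centered = []
    for row in rows:
        mu = mean(row)
        centered.append([v - mu for v in row])
    m = len(rows[0])
    return [[sum(a * b for a, b in zip(r, s)) / (m - 1) for s in centered]
            for r in centered]


def correlation(cov, i, j):
    """Normalised covariance between kernels i and j."""
    return cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])


rng = random.Random(0)
xs = [[rng.gauss(0.0, 1.0)] for _ in range(30)]
ys = [[rng.gauss(0.5, 1.0)] for _ in range(30)]
bandwidths = [0.5, 0.6, 5.0]  # hypothetical values, chosen only for illustration

core = [mmd_core_terms(xs, ys, bw) for bw in bandwidths]
mmd_estimates = [mean(terms) for terms in core]  # one MMD^2 estimate per kernel
cov = covariance_matrix(core)                    # kernel-by-kernel covariance
```

Calling `correlation(cov, i, j)` on pairs of kernels gives a normalised view of their overlap. One would expect kernels with nearby bandwidths to produce strongly correlated core terms, which is the kind of redundancy a diversity-aware aggregate is meant to penalise. Because the core terms share sample points, this covariance is only a heuristic proxy, not the covariance of the $U$-statistics themselves.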

Paper Structure

This paper contains 26 sections, 10 theorems, 89 equations, 6 figures, 3 tables.

Key Result

Theorem 1

Let $\mathcal{K}$ be a collection of bounded characteristic kernels. Under the null hypothesis $H_0$, the test in eq:testing has type-I error bounded by $\alpha$, i.e., $\Pr_{H_0}(\mathfrak{h}(X,Y;\kappa)=1)\leq\alpha$, even non-asymptotically. Meanwhile, under any fixed alternative hypothesis $H_1$
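
To make the notion of non-asymptotic type-I error control concrete, the following sketch shows one standard mechanism that achieves it: a permutation test. This excerpt does not state how the paper's test is calibrated, so the permutation scheme, the single Gaussian kernel, the biased MMD estimate, and all names and default values here are assumptions for illustration only. Under $H_0$ the pooled sample is exchangeable, so the permutation p-value with the "+1" correction satisfies $\Pr_{H_0}(p \leq \alpha) \leq \alpha$ for any sample size.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth):
    """Gaussian kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd2_biased(xs, ys, bandwidth):
    """Biased (V-statistic) estimate of MMD^2 for scalar samples."""
    def avg_k(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))
    return avg_k(xs, xs) + avg_k(ys, ys) - 2.0 * avg_k(xs, ys)


def permutation_test(xs, ys, bandwidth=1.0, alpha=0.05,
                     num_permutations=200, seed=0):
    """Return (decision, p_value); decision is 1 when H0 is rejected."""
    rng = random.Random(seed)
    observed = mmd2_biased(xs, ys, bandwidth)
    pooled = list(xs) + list(ys)
    n = len(xs)
    count = 1  # the observed statistic counts as one of the permutations
    for _ in range(num_permutations):
        rng.shuffle(pooled)  # random relabelling of the pooled sample
        if mmd2_biased(pooled[:n], pooled[n:], bandwidth) >= observed:
            count += 1
    p_value = count / (num_permutations + 1)
    return int(p_value <= alpha), p_value


data_rng = random.Random(1)
xs = [data_rng.gauss(0.0, 1.0) for _ in range(30)]
ys_shifted = [data_rng.gauss(3.0, 1.0) for _ in range(30)]
ys_same = [data_rng.gauss(0.0, 1.0) for _ in range(30)]

decision_alt, p_alt = permutation_test(xs, ys_shifted)
decision_null, p_null = permutation_test(xs, ys_same)
```

With a large mean shift the test should reject, while under identical distributions it rejects with probability at most `alpha` over repeated draws. A learned or aggregated statistic could be dropped in place of `mmd2_biased`, provided any kernel learning is done on data independent of the samples being permuted.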

Figures (6)

  • Figure 1: Comparing the test power of aggregating different sets of kernels in the two-sample testing problem on the BLOB dataset. The solid blue line shows the performance when aggregating all 20 kernels. The five dotted lines represent the test power when aggregating five different randomly selected subsets (each containing 5 kernels).
  • Figure 2: Test power versus sample size on the BLOB dataset. (a) The performance of four individual kernels with different bandwidths. (b) The performance of aggregating the first kernel $\kappa_1$ with each of the other kernels. The diversity between $\kappa_1$ and $\kappa_4$ is the largest, and that between $\kappa_1$ and $\kappa_2$ is the smallest.
  • Figure 3: Two-sample $(a\!-\!c)$ experiments on the BLOB, MNIST and ImageNet datasets, and independence $(d\!-\!f)$ experiments on the Higgs, MNIST and CIFAR10 datasets. The power results are averaged over 1,000 repetitions, and the type-I errors are all controlled at the significance level $\alpha=0.05$; the type-I error experiments can be found in the supplementary material.
  • Figure 4: Ablation study on the effectiveness of diversity and selection inference. Panels $(a\!-\!d)$ show the ablation for MMD-DUAL; panels $(e\!-\!h)$ show the ablation for HSIC-DUAL. $(a,e)$ Test power for model variants: AU denotes the simple aggregated $U$-statistic; S denotes the selection inference technique; D denotes incorporating diversity into AU; AU+S+D refers to our proposed DUAL.
  • Figure 5: Type-I error checks. $(a)$ Two-sample testing on the BLOB, MNIST and ImageNet datasets; $(b)$ independence testing on the Higgs, MNIST and CIFAR10 datasets. The type-I error results are averaged over 1,000 repetitions at the significance level $\alpha=0.05$. $n_1$ to $n_6$ refer to the six sample sizes associated with each dataset in Figure 3, where the specific sample sizes vary across datasets.
  • ...and 1 more figure

Theorems & Definitions (23)

  • Remark 1
  • Remark 2
  • Remark 3
  • Theorem 1
  • Theorem 2
  • Lemma 3
  • Corollary 4
  • Theorem 5
  • Theorem 6
  • Corollary 7
  • ...and 13 more