The Exact Equivalence of Distance and Kernel Methods for Hypothesis Testing

Cencheng Shen, Joshua T. Vogelstein

TL;DR

The bijective transformation better preserves the similarity structure, makes distance correlation and the Hilbert-Schmidt independence criterion always coincide for hypothesis testing, streamlines the code base for implementation, and lets the rich literatures on distance-based and kernel-based methodologies communicate directly with each other.

Abstract

Distance-based tests, also called "energy statistics", are leading methods for two-sample and independence tests from the statistics community. Kernel-based tests, developed from "kernel mean embeddings", are leading methods for two-sample and independence tests from the machine learning community. A fixed-point transformation was previously proposed to connect distance methods and kernel methods at the level of population statistics. In this paper, we propose a new bijective transformation between metrics and kernels. It simplifies the fixed-point transformation, inherits similar theoretical properties, makes distance methods exactly the same as kernel methods for both sample statistics and p-values, and better preserves the data structure upon transformation. Our results further advance the understanding of distance-based and kernel-based tests, streamline the code base for implementing these tests, and enable a rich literature of distance-based and kernel-based methodologies to directly communicate with each other.
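
The claim that the sample statistics coincide can be checked numerically. The sketch below is illustrative rather than the authors' implementation. It assumes the bijective induced kernel has the form $k(x_i,x_j)=\max_{s,t} d(x_s,x_t)-d(x_i,x_j)$, which is not spelled out in this summary, and it uses the biased (double-centered) sample forms of distance covariance and the Hilbert-Schmidt independence criterion. The helper names and the simulated data are hypothetical.

```python
import math
import random

def pairwise_dist(points):
    """Euclidean distance matrix for a list of coordinate tuples."""
    return [[math.dist(p, q) for q in points] for p in points]

def bijective_kernel(dist):
    """Assumed form: k(x_i, x_j) = max_{s,t} d(x_s, x_t) - d(x_i, x_j)."""
    top = max(max(row) for row in dist)
    return [[top - v for v in row] for row in dist]

def double_center(mat):
    """Return H M H with H = I - J/n: subtract row and column means, add the grand mean."""
    n = len(mat)
    row = [sum(r) / n for r in mat]
    col = [sum(mat[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[mat[i][j] - row[i] - col[j] + grand for j in range(n)]
            for i in range(n)]

def centered_product_stat(a, b):
    """Biased sample statistic: (1/n^2) * sum of entrywise products of the centered matrices."""
    n = len(a)
    ca, cb = double_center(a), double_center(b)
    return sum(ca[i][j] * cb[i][j] for i in range(n) for j in range(n)) / n**2

# Hypothetical data: y depends on x through a noisy quadratic.
random.seed(0)
n = 30
x = [(random.gauss(0, 1),) for _ in range(n)]
y = [(xi[0] ** 2 + 0.1 * random.gauss(0, 1),) for xi in x]

dx, dy = pairwise_dist(x), pairwise_dist(y)
dcov = centered_product_stat(dx, dy)                                      # distance side
hsic = centered_product_stat(bijective_kernel(dx), bijective_kernel(dy))  # kernel side
print(dcov, hsic)  # the two values agree up to floating-point error
```

The agreement follows because double centering removes the constant $\max_{s,t} d(x_s,x_t)$, leaving $HKH=-HDH$ on each side, so the two sign changes cancel in the product. The maximum is unchanged by permuting the samples, so the same identity holds for every permuted statistic, which is why permutation p-values would match as well.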

Paper Structure

This paper contains 11 sections, 10 theorems, 45 equations, 3 figures, 1 table.

Key Result

Theorem 1

Given a metric $d(\cdot,\cdot)$ and a fixed point $z$, the bijective induced kernel and the fixed-point induced kernel are related via the shift function $f(x_{i})=\max\limits_{s,t \in [n]}(d(x_s,x_t))/2-d(x_{i},z)$. Given a positive definite and translation invariant kernel $k(\cdot,\cdot)$, the bijective induced metric and the fixed-point induced metric are the same, i.e., for any $x_i, x_j$ …
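
The shift relation in the first half of the theorem can be illustrated with a small numerical check. The kernel definitions do not appear in this summary, so the sketch assumes the bijective induced kernel is $\max_{s,t} d(x_s,x_t)-d(x_i,x_j)$ and the fixed-point induced kernel is $d(x_i,z)+d(x_j,z)-d(x_i,x_j)$. Under those assumed forms, the bijective kernel equals the fixed-point kernel plus $f(x_i)+f(x_j)$, with $f$ exactly as stated in the theorem. The function name, the sample points, and the choice of $z$ are hypothetical.

```python
import math
import random

def shift_relation_gap(points, z):
    """Largest absolute gap between the (assumed) bijective induced kernel and
    the (assumed) fixed-point induced kernel shifted by f(x_i) + f(x_j)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    to_z = [math.dist(p, z) for p in points]
    top = max(max(row) for row in dist)
    # Shift function from the theorem: f(x_i) = max_{s,t} d(x_s, x_t) / 2 - d(x_i, z)
    f = [top / 2 - dz for dz in to_z]
    worst = 0.0
    for i in range(n):
        for j in range(n):
            bijective = top - dist[i][j]                  # assumed form
            fixed_point = to_z[i] + to_z[j] - dist[i][j]  # assumed form
            worst = max(worst, abs(bijective - (fixed_point + f[i] + f[j])))
    return worst

random.seed(1)
pts = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(20)]
gap = shift_relation_gap(pts, z=(0.5, -0.25))
print(gap)  # zero up to floating-point error
```

The identity is algebraic: the $d(x_i,z)$ and $d(x_j,z)$ terms cancel against the shift, so it holds for any choice of fixed point, not only the one used here.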

Figures (3)

  • Figure 1: Visualization of linear, spiral, sine, and independent-cloud relationships at $n=100$ with noise.
  • Figure 2: Samples $\{w_i\}$ are generated from a 2D Gaussian mixture of three components at $n=1000$. Starting from the Euclidean distance, the first row shows the bijective induced kernel matrix and its spectral clustering result, and the second row shows the fixed-point induced kernel matrix and its spectral clustering result.
  • Figure 3: Comparison of the testing power of the Hilbert-Schmidt independence criterion and kernel multiscale graph correlation for four simulations.

Theorems & Definitions (19)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 5
  • ...and 9 more