Random Field Representations of Kernel Distances

Ian Langmore

TL;DR

This work reframes kernel-based distances between probability measures as expectations over random fields, showing ${\mathcal{D}^2}(\mu-\nu) = \mathbb{E}_U{\langle U, \mu-\nu \rangle^2}$ and linking this view to the conventional kernel/MMD form and to Fourier/Wiener representations. It develops a family of distance notions by replacing Brownian motion with continuous fractional fields ${B^H}$ and Gaussian free fields (GFF), yielding generalized energy distances such as the Dirichlet energy distance, with explicit spectral representations: ${\mathcal{D}^2}(\mu-\nu) \propto \int \frac{|\hat{\mu}(\omega) - \hat{\nu}(\omega)|^2}{\|\omega\|^{d+2H}} d\omega$ and, for GFF, distances tied to Dirichlet energy. The paper provides conditions under which these distances are characteristic, analyzes sample-path assumptions (stationary increments, fractal scaling, Banach-space embedding), and gives support theorems ensuring the inducing fields are dense enough to distinguish measures. It also demonstrates practical implications through signal-to-noise analyses, discrete-space examples, and additive Brownian motion, guiding finite-sample estimation and informing when and how to use fractional and Gaussian fields to capture different moment- and tail-structure in distributions.

Abstract

Positive semi-definite kernels are used to induce pseudo-metrics, or "distances", between measures. We write these as an expected quadratic variation of, or expected inner product between, a random field and the difference of measures. This alternate viewpoint offers important intuition and interesting connections to existing forms. Metric distances leading to convenient finite-sample estimates are shown to be induced by fields with dense support, stationary increments, and scale invariance. The main example of this is energy distance. We show that the common generalization preserving continuity is induced by fractional Brownian motion. We induce an alternate generalization with the Gaussian free field, formally extending the Cramér-von Mises distance. Pathwise properties give intuition about practical aspects of each. This is demonstrated through signal-to-noise ratio studies.
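
As a rough illustration of the two viewpoints, the sketch below compares the field form $\mathbb{E}_U\langle U, \mu_n-\nu_m\rangle^2$ with the usual kernel form for empirical measures. It assumes the standard fractional Brownian field covariance $\tfrac12(\|s\|^{2H}+\|t\|^{2H}-\|s-t\|^{2H})$, which may differ from the paper's normalization by a constant. The function names `fbm_covariance`, `field_form`, and `kernel_form` are hypothetical and are not taken from the paper.

```python
import numpy as np

def fbm_covariance(z, hurst):
    """Assumed fBM covariance at points z of shape (n, d):
    0.5 * (|s|^{2H} + |t|^{2H} - |s - t|^{2H})."""
    norms = np.linalg.norm(z, axis=1) ** (2 * hurst)
    dists = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1) ** (2 * hurst)
    return 0.5 * (norms[:, None] + norms[None, :] - dists)

def field_form(x, y, hurst=0.5):
    """E_U <U, mu_n - nu_m>^2 for empirical measures on samples x and y.
    For a centered Gaussian field U this expectation equals w^T K w,
    where w holds the signed sample weights and K is the field covariance."""
    z = np.concatenate([x, y])
    w = np.concatenate([np.full(len(x), 1.0 / len(x)),
                        np.full(len(y), -1.0 / len(y))])
    return w @ fbm_covariance(z, hurst) @ w

def kernel_form(x, y, hurst=0.5):
    """V-statistic E|X-Y|^{2H} - 0.5 E|X-X'|^{2H} - 0.5 E|Y-Y'|^{2H}."""
    def mean_dist(a, b):
        d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
        return np.mean(d ** (2 * hurst))
    return mean_dist(x, y) - 0.5 * mean_dist(x, x) - 0.5 * mean_dist(y, y)

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))            # samples from mu
y = rng.normal(loc=0.5, size=(40, 2))   # samples from nu (shifted mean)
print(field_form(x, y), kernel_form(x, y))  # the two forms agree up to rounding
```

The two forms coincide because the signed weights sum to zero, so the $\|s\|^{2H}$ and $\|t\|^{2H}$ terms of the covariance cancel. With $H=1/2$ and this normalization, the value is half of the energy distance as it is commonly defined, $2\,\mathbb{E}\|X-Y\| - \mathbb{E}\|X-X'\| - \mathbb{E}\|Y-Y'\|$.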

Paper Structure

This paper contains 21 sections, 7 theorems, 78 equations, 4 figures.

Key Result

Proposition 2.1

Let $\mathcal{M}\subset{\mathcal{F}_0}({{\mathbb{R}}^d})$. Suppose Gaussian field $U$ is stationary with spectral density satisfying align:spectral-density-assumptions-stationary-fields, or has stationary increments with density satisfying align:spectral-density-assumptions-stationary-increment-fiel

Figures (4)

  • Figure 1: Traces of fBM: Smaller Hurst index $H$ results in a field that decorrelates quickly, and approximates a (random) constant plus stationary noise. $H=0.5$ corresponds to Brownian motion. As $H\to1$, the traces become linear with random slope.
  • Figure 2: Moments of fractional motion: Shows that, for smaller $H < 1/2$, ${\langle B^H,t^k\rangle}$ is larger for even $k$. For $H > 1/2$, ${\langle B^H,t^k\rangle}$ is larger for odd $k$.
  • Figure 3: Signal-to-noise sweep comparing continuous and Dirichlet energies: SNR as a function of multivariate Student's t dof ${f}$. Computed with 4M samples using the kernel forms of the scores. Specifically, $H=1/2$ and $\alpha=5$ from section \ref{section:summary-of-fractional-distances}. The Dirichlet energy is well defined even when the distribution has non-finite mean. The kernel form of continuous energy requires degrees of freedom ${f} > 2H=1$; otherwise, results become NaN in 32-bit.
  • Figure 4: SNR Sweep: Shows that the signal, ${\mathcal{D}^2}(\mu-\nu; B^H)$, is the differentiating factor in SNR calculations, since the standard deviation is mostly perturbation independent. The signal shape (with respect to $H$) is as expected due to the path-wise heuristics from section \ref{section:SNR-and-H}.

Theorems & Definitions (12)

  • Definition 1.1
  • Proposition 2.1
  • Lemma 2.2
  • proof
  • Corollary 3.1
  • Theorem 6.1
  • Corollary 6.2
  • Lemma 6.3
  • proof
  • Lemma 6.4
  • ...and 2 more