Moment Expansions of the Energy Distance

Ian Langmore

TL;DR

The paper analyzes the squared energy distance $\mathcal{D}^2(X,Y)$ in the regime where the distributions are close, showing that the mean difference $\mu$ typically dominates the loss, with covariance differences entering at higher order via an averaged, dimension-dependent term. By expressing $\mathcal{D}^2$ through a Fourier-cumulant expansion and introducing a decay scale $\lambda$, the author derives a leading-moments expansion: a main $O(1/\lambda)$ term proportional to $\|\mu\|^2$ and a secondary $O(1/\lambda^3)$ term involving the covariance difference $\Delta$, $\mu$, and skew cumulants, plus a controlled remainder. The expansion is then specialized to multivariate Gaussians to obtain explicit forms, which show that off-diagonal covariance contributions are suppressed by a factor of order $1/d$ under spherical symmetry, while the diagonal part contributes at order $O(d^{-1/2})$ to the mean term. The work also contrasts the energy-distance-based gradient with a standard covariance loss via a cosine similarity analysis, showing how dimension and correlation structure influence learning dynamics. Numerical verification across Gaussian and non-Gaussian distributions confirms the leading-moments predictions and highlights the regimes where the theory holds, offering practical guidance for using energy-distance-based losses in high-dimensional learning tasks.
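For orientation, the block below recalls the standard definition of the squared energy distance (due to Székely and Rizzo) together with its characteristic-function form, which is the usual starting point for a Fourier-based expansion. The symbols $X'$, $Y'$ (independent copies), $\varphi_X$, $\varphi_Y$ (characteristic functions), and $c_d$ are not taken from the summary above, and the paper's own normalization may differ by a constant factor.

$$
\mathcal{D}^2(X,Y) \;=\; 2\,\mathbb{E}\|X-Y\| \;-\; \mathbb{E}\|X-X'\| \;-\; \mathbb{E}\|Y-Y'\|
\;=\; \frac{1}{c_d}\int_{\mathbb{R}^d} \frac{|\varphi_X(t)-\varphi_Y(t)|^2}{\|t\|^{d+1}}\,dt,
\qquad c_d = \frac{\pi^{(d+1)/2}}{\Gamma\!\left(\tfrac{d+1}{2}\right)}.
$$

The weight $\|t\|^{-(d+1)}$ emphasizes low frequencies, where $\varphi_X - \varphi_Y$ is governed by the lowest-order moment differences.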

Abstract

The energy distance is used to test distributional equality and as a loss function in machine learning. While $D^2(X, Y)=0$ only when $X\sim Y$, the sensitivity to different moments is of practical importance. This work considers $D^2(X, Y)$ in the case where the distributions are close. In this regime, $D^2(X, Y)$ is more sensitive to differences in the means $\bar{X}-\bar{Y}$ than to differences in the covariances $\Delta$. This is due to the structure of the energy distance and is independent of dimension. The sensitivity to on-diagonal versus off-diagonal components of $\Delta$ is examined when $X$ and $Y$ are close to isotropic. Here a dimension-dependent averaging occurs and, in many cases, off-diagonal correlations contribute significantly less. Numerical results verify that these relationships hold even when distributional assumptions are not strictly met.
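The claim about sensitivity to means versus covariances can be probed numerically. The following is a minimal sketch, not the paper's code: it estimates $D^2$ with a plain V-statistic over all sample pairs, using the standard definition $2\,\mathbb{E}\|X-Y\| - \mathbb{E}\|X-X'\| - \mathbb{E}\|Y-Y'\|$. The function names, the perturbation size `eps`, the sample size, and the choice of a two-dimensional standard Gaussian baseline are all illustrative assumptions.

```python
import math
import random


def mean_pairwise_distance(a, b):
    """Average Euclidean distance over all pairs (x, y), x in a, y in b."""
    total = 0.0
    for x in a:
        for y in b:
            total += math.dist(x, y)
    return total / (len(a) * len(b))


def energy_distance_sq(xs, ys):
    """V-statistic estimate of D^2 = 2 E|X-Y| - E|X-X'| - E|Y-Y'|."""
    return (2.0 * mean_pairwise_distance(xs, ys)
            - mean_pairwise_distance(xs, xs)
            - mean_pairwise_distance(ys, ys))


def sample_gaussian(n, mean, std, rng):
    """Draw n points from a Gaussian with independent coordinates."""
    return [[rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(n)]


def demo(n=300, eps=0.5, seed=0, d=2):
    """Compare a mean perturbation with a variance perturbation of size eps.

    Returns (D^2 under mean shift, D^2 under variance change); both are
    applied to the first coordinate only (an illustrative choice).
    """
    rng = random.Random(seed)
    zero, one = [0.0] * d, [1.0] * d
    xs = sample_gaussian(n, zero, one, rng)
    # Same covariance as the baseline, mean shifted by eps.
    y_mean = sample_gaussian(n, [eps] + [0.0] * (d - 1), one, rng)
    # Same mean as the baseline, variance of one coordinate raised by eps.
    y_var = sample_gaussian(n, zero, [math.sqrt(1.0 + eps)] + [1.0] * (d - 1), rng)
    return energy_distance_sq(xs, y_mean), energy_distance_sq(xs, y_var)


if __name__ == "__main__":
    d_mean, d_var = demo()
    print(f"D^2 with mean shift:      {d_mean:.4f}")
    print(f"D^2 with variance change: {d_var:.4f}")
```

If the abstract's claim carries over to this setup, the mean-shift estimate should come out noticeably larger than the variance-change estimate. The V-statistic has a small positive bias of order $1/n$, so very small perturbations need larger samples to be resolved.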

